regex formula

Ciprian Codrea
Ciprian Codrea Altair Community Member
edited September 5 in Community Q&A

I have a PDF file and I want to capture a number. The first row starts with *01 and the second row contains E1. I want to capture the first number after E1, as shown below:

*01        NZ             6500                   
           E1                                9500

So I want to search for *01 on the first row and for E1 on the second row, and capture "9500" in this example. Does anybody know why my regex is not working?

(?m)^\*01.*\n^\s*E1.*?(\d+)

Thank you in advance,

Ciprian

 

Answers

  • CPorthouse
    CPorthouse
    Altair Employee
    edited September 5

    Text files (or PDF files rendered as text) do not expose a line break at the end of each line to the trap, so you cannot search for \n in the regex, and the (?m) at the beginning therefore has no effect. When capturing data, a trap only works on a single line. You can have a multi-line sample, but you need to specify which line of that sample the trap expression should look at.

    Keep in mind that, in terms of parsing efficiency, standard trapping is the fastest, followed by floating traps, and then regex traps. Depending on your data, there may be a better way to capture these lines.
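    To make the single-line point concrete, here is a minimal sketch in Python. It is not Monarch: Python's re module only stands in for the trap engine, and the assumption (taken from the explanation above) is that a trap is handed one line at a time. The same pattern finds 9500 when it can see the line break, and finds nothing when each line is tested on its own.

```python
import re

# The pattern from the question, unchanged.
PATTERN = re.compile(r"(?m)^\*01.*\n^\s*E1.*?(\d+)")

# The two sample rows, joined by an explicit line break.
report = (
    "*01        NZ             6500\n"
    "           E1                                9500\n"
)

# Whole text, line break included: the pattern can span both rows.
whole_text_match = PATTERN.search(report)
whole_text_value = whole_text_match.group(1) if whole_text_match else None

# One line at a time (assumed to be how a trap sees the data):
# there is no "\n" left to match, so the pattern never succeeds.
per_line_values = [
    m.group(1)
    for line in report.splitlines()
    if (m := PATTERN.search(line))
]

print(whole_text_value)  # 9500
print(per_line_values)   # []
```

    So the regex itself is fine for an engine that sees the whole text; it fails here only because each line is evaluated separately.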

    • Is the data you are looking to capture always on two lines?
    • Maybe you can use two templates: detail and append/footer (depending on the layout)

    Are you able to share a larger sample?

  • Ciprian Codrea
    Ciprian Codrea Altair Community Member
    edited September 5

    Hi Chris,

     

    Thank you for your fast answer. In the attached example, if the first row is *07 and the second row is E1, I want to capture the first number after E1. Then, if the first row is *08 and the second row is again E1, I want to capture the value after E1 as well.

    With a standard group footer on E1 I capture only the first value, but not the second value, for month 08.

     

    What do you suggest to do in this case?

  • CPorthouse
    CPorthouse
    Altair Employee
    edited September 5

    If I understand your case, let's use this as an example:

    *01        NZ             6500
               E1                                9500
    *07        NZ             6500
               E1                                7500
    *08        NZ             6500
               E1                                8500

    There are a couple of ways you could achieve this, but I will demonstrate the multi-line sample approach. I am not sure whether you are using Monarch Classic or Data Prep Studio, but it works the same in either product.

    I would select the two lines that contain the data you are looking to capture and create a detail template.  Using a standard trap on the first line, look for *ÑÑ:

    image

    You can then define your fields:

    image
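    For readers who want to check the logic outside of Monarch, the multi-line detail template can be approximated in Python. This is only a sketch under assumptions: the helper name capture_e1_values and both patterns are hypothetical, the first pattern stands in for the *ÑÑ trap on line 1, and the second stands in for the field defined on line 2 of the sample. Nothing here is Monarch's own API.

```python
import re

# Sample report, one line break per row.
SAMPLE = """\
*01        NZ             6500
           E1                                9500
*07        NZ             6500
           E1                                7500
*08        NZ             6500
           E1                                8500
"""

TRAP = re.compile(r"^\*(\d\d)")           # stand-in for the *ÑÑ trap on line 1
E1_FIELD = re.compile(r"^\s*E1\s+(\d+)")  # first number after E1 on line 2


def capture_e1_values(text):
    """Return (code, value) for each trapped line followed by an E1 line."""
    lines = text.splitlines()
    records = []
    # Walk consecutive line pairs: the trap is tested on the first line only,
    # and the field is read from the line directly below it.
    for first, second in zip(lines, lines[1:]):
        trap = TRAP.match(first)
        field = E1_FIELD.match(second)
        if trap and field:
            records.append((trap.group(1), int(field.group(1))))
    return records


print(capture_e1_values(SAMPLE))
# [('01', 9500), ('07', 7500), ('08', 8500)]
```

    Each pattern is applied to a single line, which mirrors the idea that the trap looks at one line of the sample while the field is taken from another.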