regex formula

Altair Community Member

Sep 5, 2024

Updated Sep 5, 2024 by Ciprian Codrea

I have a PDF file and I want to capture a number. The first row starts with *01 and the second row starts with E1. I want to capture the first number after E1, as below:

*01 NZ 6500
E1 9500

So I want to search for *01 on the first row and E1 on the second row, and capture "9500" in this example. Does anybody know why my regex is not working?

(?m)^\*01.*\n^\s*E1.*?(\d+)

Thank you in advance,

Ciprian


CPorthouse

Altair Employee

Sep 5, 2024

Updated Sep 5, 2024 by CPorthouse

As far as trapping is concerned, the lines of a text file (or of a PDF file rendered as text) do not include the line break at the end, so you cannot search for line breaks in the regex, and the (?m) at the beginning has no effect. When capturing data, a trap only works on a single line. You can have a multi-line sample, but you need to specify which line of that sample the trap expression should look at.
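As an analogy outside Monarch, the short Python sketch below shows the same effect: the pattern from the question does match when a regex engine sees the whole text including the newline, but it cannot match when each line is tested on its own. This is only an illustration; Monarch's trap engine is not Python's `re` module, and the variable names are made up for this example.

```python
import re

# Pattern from the original question, unchanged.
PATTERN = re.compile(r"(?m)^\*01.*\n^\s*E1.*?(\d+)")

sample = "*01 NZ 6500\nE1 9500\n"

# Whole-text search: the engine can see the newline, so the pattern matches.
whole = PATTERN.search(sample)
whole_value = whole.group(1) if whole else None

# Line-by-line search (an analogy for single-line traps): no individual
# line contains a newline, so the pattern can never match.
per_line = [PATTERN.search(line) for line in sample.splitlines()]

print(whole_value)
print(per_line)
```

So the pattern itself is not necessarily wrong as a regex; it asks for something a single-line trap never gets to see.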

Keep in mind that, in terms of parsing efficiency, standard trapping is the fastest, followed by floating traps and then regex traps. Depending on your data, there may be a better way to capture these lines.

Is the data you are looking to capture always on two lines?
Maybe you can use two templates: a detail template and an append or footer template, depending on the layout.

Are you able to share a larger sample?

Ciprian Codrea

Altair Community Member

Sep 5, 2024

Updated Sep 5, 2024 by Ciprian Codrea

Hi Chris,

Thank you for your fast answer. In the attached example, where the first row starts with *07 and the second row starts with E1, I want to capture the first number after E1. Then, where the first row starts with *08 and the second row again starts with E1, I want to capture the value after that E1 as well.

With a standard group footer on E1, I capture only the first value; I do not capture the second value, for month 08.

What do you suggest in this case?

capture 1.png

CPorthouse

Altair Employee

Sep 5, 2024

Updated Sep 5, 2024 by CPorthouse

If I understand your case, let's use this as an example:

*01        NZ             6500
E1                                9500
*07        NZ             6500
E1                                7500
*08        NZ             6500
E1                                8500

There are a couple of ways you could achieve this, but I will demonstrate the multi-line sample. I am not sure whether you are using Monarch Classic or Data Prep Studio, but it works the same in either product.

I would select the two lines that contain the data you are looking to capture and create a detail template. Using a standard trap on the first line, look for *ÑÑ (a literal asterisk followed by two numeric trap characters).

You can then define your fields, including one on the second line of the sample for the number after E1.
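For readers who want to check the logic outside Monarch, here is a minimal Python sketch of the same idea: treat each "*" plus two digits line as the trap line and read the first number after E1 from the line that follows. The function name `capture_e1_values` and both patterns are hypothetical stand-ins, not Monarch functionality, and the sketch assumes that only whitespace separates E1 from the number.

```python
import re

# Hypothetical stand-ins for the Monarch template, for illustration only.
TRAP = re.compile(r"^\*(\d\d)")            # mimics the *ÑÑ trap on line 1
E1_VALUE = re.compile(r"^\s*E1\s+(\d+)")   # first number after E1 on line 2

def capture_e1_values(text):
    """Return (code, value) pairs for each '*NN' line followed by an E1 line."""
    lines = text.splitlines()
    results = []
    # Walk over consecutive line pairs, like a two-line detail sample.
    for first, second in zip(lines, lines[1:]):
        trap = TRAP.match(first)
        value = E1_VALUE.match(second)
        if trap and value:
            results.append((trap.group(1), int(value.group(1))))
    return results

report = """\
*01        NZ             6500
E1                                9500
*07        NZ             6500
E1                                7500
*08        NZ             6500
E1                                8500
"""

for code, value in capture_e1_values(report):
    print(code, value)
```

Because every "*NN" line starts a new record, the values for 07 and 08 are both picked up, which is the case a single group footer on E1 missed.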
