regex formula
I have a PDF file and I want to capture a number. The first row contains *01 and the second row contains E1, and I want to capture the first number after E1, as below:
*01 NZ 6500
E1 9500
So I want to search for *01 on the first row and for E1 on the second row, and capture "9500" in this example. Does anybody know why my regex is not working?
(?m)^\*01.*\n^\s*E1.*?(\d+)
Thank you in advance,
Ciprian
Answers
As far as traps are concerned, the lines of a text file (or a PDF file rendered as text) do not contain a line break at the end, so you cannot search for line breaks in the regex, and the (?m) at the beginning therefore has no effect. When capturing data, a trap only works on a single line. You can have a multi-line sample, but you need to specify which line in that sample the trap expression should look at.
In terms of parsing efficiency, keep in mind that standard traps are the fastest, followed by floating traps and then regex traps. Depending on your data, there may be a better way to capture these lines.
- Is the data you are looking to capture always on two lines?
- Maybe you can use two templates: detail and append/footer (depending on the layout)
Are you able to share a larger sample?
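To illustrate the point about single-line matching, here is a minimal Python sketch (not Monarch itself). It assumes that a trap sees each line on its own, without the trailing line break, and uses Python's `re` module as a stand-in for the regex engine. The pattern is the one from the question, and the variable names are only for illustration.

```python
import re

# The pattern from the question, unchanged.
PATTERN = re.compile(r"(?m)^\*01.*\n^\s*E1.*?(\d+)")

text = "*01 NZ 6500\nE1 9500"

# Applied to the whole text, the pattern can see the line break and matches.
whole = PATTERN.search(text)
whole_value = whole.group(1) if whole else None

# Applied one line at a time (assumed here to mimic how a trap sees the data),
# there is never a "\n" to match, so nothing is captured on either line.
per_line = [PATTERN.search(line) for line in text.splitlines()]

print(whole_value)
print(per_line)
```

Under that assumption the pattern itself is fine. It fails only because it requires a line break that a single line never contains.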
Hi Chris,
Thank you for your fast answer. In the attached example, when the first row is *07 and the second row is E1, I want to capture the first number after E1; then, when the first row is *08 and the second row is again E1, I want to capture the value after that E1 as well.
With a standard group footer on E1 I will capture only the first value, but I will not capture the second value, for month 08.
What do you suggest to do in this case?
If I understand your case, let's use this as an example:
*01 NZ 6500
E1 9500
*07 NZ 6500
E1 7500
*08 NZ 6500
E1 8500
There are a couple of ways you could achieve this, but I will demonstrate the multi-line sample. I am not sure if you are using Monarch Classic or Data Prep Studio, but it works the same in either product.
I would select the two lines that contain the data you are looking to capture and create a detail template. Using a standard trap on the first line, look for *ÑÑ:
You can then define your fields:
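For readers without Monarch at hand, the two-line detail idea can be sketched roughly in Python. This is only an approximation: it assumes that Ñ in the trap stands for a single digit, so `^\*\d\d` plays the role of `*ÑÑ`, and the names `TRAP`, `E1_VALUE` and `capture` are hypothetical helpers, not part of either product.

```python
import re

SAMPLE = """\
*01 NZ 6500
E1 9500
*07 NZ 6500
E1 7500
*08 NZ 6500
E1 8500
"""

# Rough stand-in for the *ÑÑ trap on the first line (assumes Ñ = one digit).
TRAP = re.compile(r"^\*(\d\d)")
# First number after E1 on the second line of the two-line sample.
E1_VALUE = re.compile(r"^\s*E1\s+(\d+)")


def capture(text):
    """Return (month, value) pairs for each trapped line followed by an E1 line."""
    lines = text.splitlines()
    records = []
    # Walk over consecutive line pairs: the trap is tested on the first line,
    # the field is read from the second line.
    for first, second in zip(lines, lines[1:]):
        trapped = TRAP.match(first)
        value = E1_VALUE.match(second)
        if trapped and value:
            records.append((trapped.group(1), value.group(1)))
    return records


records = capture(SAMPLE)
print(records)
```

Each regex here only ever looks at one line, which matches the single-line behaviour of traps. The pairing of the two lines is done outside the regex, much as the detail template does with its two-line sample.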