To extract multiple lines from a text document

Altair Forum User
Altair Forum User
Altair Employee
edited September 2017 in Community Q&A

Hi Datawatch community members,

     I need to extract a block of data from a text file and save it into a separate text file.  Is this possible using Monarch?

 

Thank you,

Anu

Answers

  • Altair Forum User
    Altair Forum User
    Altair Employee
    edited September 2017

    Anu, when you say you need to "extract a block of data," do you mean to extract it into a table, or to extract the text exactly as it appears?

  • Altair Forum User
    Altair Forum User
    Altair Employee
    edited September 2017

    I want to extract the text exactly as it is and save it into a text file.

     

    Thank you so much for stepping up to help me.  Truly appreciate it.

  • Altair Forum User
    Altair Forum User
    Altair Employee
    edited October 2017

    Hi Steve,

      Do you have any suggestions for me?

    Thanks,

    Anu

  • Altair Forum User
    Altair Forum User
    Altair Employee
    edited October 2017

    Anu, Monarch is designed to extract data from reports into tables. It can also export reports, but as far as I know, that exports the entire text file rather than a specific defined section.

     

    For this situation, I would build a Model with Start and End Region Templates defining the boundaries of what you need to extract. Then capture everything in between as the Detail Template. This Detail would need to be a Regular Expression trap so that it captures every line, whether or not there is anything on it. The expression would be ^(?<Data>.*)$

     

    Then create a field for that capture and ensure the field width is set to 254 so that it will get the entire line. From there, you will have a table with only one column, but that column will contain all the text of the region you need to extract. After that, create an Export, give the file name a .TXT extension and select export file type as delimited text. Finally, run the export and that should give you a text file with only the defined region.
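For anyone who wants to see the logic outside Monarch, here is a minimal Python sketch of the same idea. It is not Monarch code. The function name `extract_region` and the marker strings are hypothetical, and it assumes the Start and End lines themselves are left out of the output. The named group is written `(?P<Data>...)` because that is Python's spelling of `(?<Data>...)`.

```python
import re

# Python spelling of the trap ^(?<Data>.*)$ : matches any line, including an empty one.
LINE_TRAP = re.compile(r"^(?P<Data>.*)$")

# Mirrors the 254-character field width suggested above.
FIELD_WIDTH = 254


def extract_region(lines, start_marker, end_marker):
    """Return the lines between the first line containing start_marker
    and the next line containing end_marker (both marker lines excluded)."""
    captured = []
    inside = False
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not inside:
            # Still looking for the start of the region.
            if start_marker in line:
                inside = True
            continue
        if end_marker in line:
            # End of the region reached; stop reading.
            break
        match = LINE_TRAP.match(line)
        # Keep the whole line, truncated to the field width.
        captured.append(match.group("Data")[:FIELD_WIDTH])
    return captured
```

To save the result, something like `open("region.txt", "w").write("\n".join(extract_region(open("report.txt"), "START", "END")))` would do, where the file names and markers are placeholders for your own.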

  • Altair Forum User
    Altair Forum User
    Altair Employee
    edited October 2017

    Anu,

     

    Stephen raises a number of interesting questions here.

     

    Firstly, is this one block of data from one report at a time?

     

    Or one block of data from multiple reports in a single process?

     

    Or many blocks of data from a single report?  (Or many blocks from many reports ...?)

     

    Do you need to export to a file or files or do you just need to cut and paste?

     

    Do you have the latest version of Monarch with Data Prep Studio or are you running with an older version?

     

    How do you need to identify the block of data/text that is to be extracted? Is this something you would be doing interactively on a screen or does it need to be a full model and export?

     

    If you can share with us an example of what you are working with (or a mock up to illustrate what it might look like) that would be very useful.

     

     

    Grant

  • Altair Forum User
    Altair Forum User
    Altair Employee
    edited October 2017

    Hi Steve and Grant,

    Firstly, sorry for being a little late in replying to your questions.  Thanks to you both for taking the time to help me out with this :-)  Answers to your questions:

     

    1.  Yes, this is to extract one block of data from one report at a time.

    2.  Not from multiple reports but just one report.

    3.  No, just one block of data from a single report.

    4.  Extract that portion and save it as a text file on a monthly basis.

    5.  I currently have Monarch 9 but will be moving to Monarch v14.2; I am not sure if 14 is the latest version.

    6.  I do not want to identify the data in an interactive fashion, as the block of data I need always appears in a predefined place.  So, I want a full model that will just extract all the data between 2 defined regions and export that to a text file.

    7.  Here is an example

    Input file:

                          NY - STATE 1

    Fact1 xxxxyyyyyyzzzzzzz

    1. data

    2 .data .....

    till

    30. data

                       NY - STATE 1

    Fact 2  xxxxyyyyyyzzzzzzz

    1. data

    2 .data .....

    till

    30. data

                    NY - STATE 1

    Fact 3 xxxxyyyyyyzzzzzzz

    1. data

    2 .data .....

    till

    30. data

               CA - STATE 2

    Fact1 xxxxyyyyyyzzzzzzz

    1. data

    2 .data .....

    till

    30. data

               CA - STATE 2

    Fact 2 xxxxyyyyyyzzzzzzz

    1. data

    2 .data .....

    till

    30. data

               CA - STATE 2

    Fact 3 xxxxyyyyyyzzzzzzz

    1. data

    2 .data .....

    till

    30. data

     

    I want the output to get all the details for NY - STATE 1 starting from Fact 1 till Fact 3 and save it into a separate text or doc file.  Same for CA.

    Is this possible????

    TIA

    Anu

  • Altair Forum User
    Altair Forum User
    Altair Employee
    edited October 2017

    Anu,

     

    Your lines from "Fact" through to line 30 of "data" look like a "Detail" record.

     

    If there are always 30 lines it can be "mapped" in the model as 30 lines.

     

    If the number of lines varies under each fact then a different approach may be required.

     

    The State line looks like an Append record.

     

    Once you have your data table extracted you can filter by the State field to get one group of records at a time.

     

    For export purposes you should be able to set up an export process that groups by the filter and exports for each occurrence of the filter.

     

    However, I think you will need to do this as a summary, and whether the summary can be persuaded to work with your 30+ detail lines will depend on how they will ultimately need to be extracted. We have not yet pinned that down as far as I can tell.

     

    The alternative is that lines 1 to 30 are the detail records.

     

    "Fact" is an append record.

     

    "State" is an append record.

     

    A structure like that might be more suited to the use of a summary for the output you need. A little experimentation may be required to assess the better options.
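To make the "State as an append field, then export once per State" idea concrete, here is a rough Python sketch. It is not Monarch code. The function names `split_by_state` and `export_blocks` are hypothetical, and the header pattern is an assumption based only on the sample lines posted above (two capital letters, " - STATE ", and a number). The header lines are kept in the output.

```python
import re
from pathlib import Path

# Assumed header shape, taken from the sample: e.g. "   NY - STATE 1".
STATE_HEADER = re.compile(r"^\s*([A-Z]{2}) - STATE \d+\s*$")


def split_by_state(lines):
    """Group report lines by the most recent State header seen.

    Lines before the first header are ignored. Returns a dict that maps
    the state code to its lines, in first-seen order.
    """
    blocks = {}
    current = None
    for raw in lines:
        line = raw.rstrip("\r\n")
        header = STATE_HEADER.match(line)
        if header:
            # Works like an append field: the state applies to the lines that follow.
            current = header.group(1)
            blocks.setdefault(current, [])
        if current is not None:
            blocks[current].append(line)
    return blocks


def export_blocks(blocks, directory):
    """Write one text file per state (e.g. NY.txt) and return the paths."""
    paths = []
    for state, block in blocks.items():
        path = Path(directory) / f"{state}.txt"
        path.write_text("\n".join(block) + "\n", encoding="utf-8")
        paths.append(path)
    return paths
```

Because the grouping key is the header text rather than a fixed count of 30 lines, this sketch does not care whether the number of lines under each Fact varies.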

     

     

    Grant

  • Altair Forum User
    Altair Forum User
    Altair Employee
    edited October 2017

    This is absolutely possible, as I described above. You will need to create a Model for each section of the report that you need to burst.

    Bursting Model.png

  • Altair Forum User
    Altair Forum User
    Altair Employee
    edited October 2017

    Thanks Steve.

    Is this possible with Monarch Version 9?  I have a very old version :-(

  • Altair Forum User
    Altair Forum User
    Altair Employee
    edited October 2017

    Trapping by Regular Expression was not possible in version 9. There may be another way to do something similar using other means of trapping, but I don’t know.

  • Altair Forum User
    Altair Forum User
    Altair Employee
    edited October 2017

    Starting from which version is this trapping by regular expression available?

  • Altair Forum User
    Altair Forum User
    Altair Employee
    edited October 2017

    I believe version 13.

  • Altair Forum User
    Altair Forum User
    Altair Employee
    edited October 2017

    Thanks a lot & Have a great weekend!

  • Altair Forum User
    Altair Forum User
    Altair Employee
    edited October 2017

    Anu,

     

    If each block you need to extract individually starts with the State Identifier Header and ends with line 30 (or at the next State Identifier Header even if it is the same State) then you should be able to do that using V9.

     

    If you want all records for one State in a single extraction that should be possible too but the number of lines it might produce could be a question mark.

     

    You would certainly have more options for some things using version 14 (14.2 is the current release with another point release due soon.)

     

    However, some things (probably ones you do not use) work differently, and a few have been made available in a different way in more recent versions. Whether that is significant depends entirely on how your organization uses Monarch at the moment and what it might want to do with it in the future. It's really just a matter of understanding the options and choosing according to needs.

     

    HTH.

     

     

    Grant

  • Altair Forum User
    Altair Forum User
    Altair Employee
    edited November 2017

    Hi Steve,

        I have Monarch 14 installed.  I was able to extract the data but need to use the data prep tool to extract the selected rows.  So it looks like this cannot be automated.

     

    Thanks in advance,

    Anu !

  • Altair Forum User
    Altair Forum User
    Altair Employee
    edited November 2017

    After you write the Detail Template using the Regular Expression supplied, it will highlight everything on the sample line. From there, you need to right-click on that highlighted field and say "Create a Field from this Capture."

    Making Field from REGEX.png

  • Altair Forum User
    Altair Forum User
    Altair Employee
    edited November 2017

    I do not see that "Data" pop-up when I right-click.  Not sure why?

  • Altair Forum User
    Altair Forum User
    Altair Employee
    edited November 2017

    Did you create a Detail Template that’s a Regular Expression trap?

  • Altair Forum User
    Altair Forum User
    Altair Employee
    edited November 2017

    Yes I did.