Filter to include rows based on content of rows above and below?

Altair Forum User
Altair Forum User
Altair Employee
edited April 2018 in Community Q&A

Hi everyone,

 

I am a new Monarch user and have a feeling this question may be incredibly obvious, so I apologize in advance if it is.

 

I am using Monarch to search for a list of keywords within a PDF document, and generate a report that includes every line of the PDF document that contains one of those keywords. I am accomplishing this by bringing in each line of the PDF document into a column ("Report"), and then searching each of those lines individually and extracting the keywords into another column ("Keywords") using regexextract. I then filter the "Keywords" column to exclude all rows containing a blank. This leaves me with the "Report" column only showing rows in which a keyword is present, and the "Keywords" column showing me all of the keywords that were found in that row.

 

What I would like to accomplish is to also include the lines preceding and following the lines in which a keyword is present. So, if a keyword is present in line 20 I would like the "Report" column to have lines 19, 20, and 21.

 

Please let me know if you would like further explanation or examples. And thanks in advance!

Tagged:

Answers

  • Altair Forum User
    Altair Forum User
    Altair Employee
    edited April 2018

    Hi Chase,

     

    Create a new column using the expression RecNo().  This will sequentially number every row in the table.  Then create two new calculated fields with Recno()-1 and Recno()+1 respectively.

     

    Finally, join these two back to the main data table on Recno() to Recno()-1 and Recno() to Recno()+1

     

    It will be easier to do this in Monarch Data Prep Studio as you can just Copy the table without an interim export.  If you are using Monarch Classic, you would need to export the table to an external file (csv, Excel etc) and bring it back in as 2 External Lookups.  This makes it a 2 stage process and introduces the risk of accidently linking to an old data source. 

    Regards,

    Steve.

  • Altair Forum User
    Altair Forum User
    Altair Employee
    edited April 2018

    Hi Chase,

     

    Create a new column using the expression RecNo().  This will sequentially number every row in the table.  Then create two new calculated fields with Recno()-1 and Recno()+1 respectively.

     

    Finally, join these two back to the main data table on Recno() to Recno()-1 and Recno() to Recno()+1

     

    It will be easier to do this in Monarch Data Prep Studio as you can just Copy the table without an interim export.  If you are using Monarch Classic, you would need to export the table to an external file (csv, Excel etc) and bring it back in as 2 External Lookups.  This makes it a 2 stage process and introduces the risk of accidently linking to an old data source. 

    Regards,

    Steve.

    Hey Steve,

     

    Thanks for the reply!

     

    I'm not sure I am following your solution. Would you mind explaining it further?


    I have created a new column by adding a calculated field with Recno(). I then added two additional calculated fields with Recno()-1 and Recno+1, respectively. This leaves me with three new columns on my original data table.


    Then what? I'm not sure what you mean by " join these two back to the main data table on Recno() to Recno()-1 and Recno() to Recno()+1".


    My apologies if I am missing something simple, this is my first week using Monarch. I appreciate your help.


    Chase

  • Altair Forum User
    Altair Forum User
    Altair Employee
    edited April 2018

    Hey Steve,

     

    Thanks for the reply!

     

    I'm not sure I am following your solution. Would you mind explaining it further?


    I have created a new column by adding a calculated field with Recno(). I then added two additional calculated fields with Recno()-1 and Recno+1, respectively. This leaves me with three new columns on my original data table.


    Then what? I'm not sure what you mean by " join these two back to the main data table on Recno() to Recno()-1 and Recno() to Recno()+1".


    My apologies if I am missing something simple, this is my first week using Monarch. I appreciate your help.


    Chase

    Hi Chase,

     

    No problem.  Are you using Data Prep Studio (the blue cube icon) or Monarch Classic (the white 2d icon)?

     

    Steve

  • Altair Forum User
    Altair Forum User
    Altair Employee
    edited April 2018

    Hi Chase,

     

    No problem.  Are you using Data Prep Studio (the blue cube icon) or Monarch Classic (the white 2d icon)?

     

    Steve

    I am using the Data Prep Studio.

  • Altair Forum User
    Altair Forum User
    Altair Employee
    edited April 2018

    Hey Steve,

     

    Thanks for the reply!

     

    I'm not sure I am following your solution. Would you mind explaining it further?


    I have created a new column by adding a calculated field with Recno(). I then added two additional calculated fields with Recno()-1 and Recno+1, respectively. This leaves me with three new columns on my original data table.


    Then what? I'm not sure what you mean by " join these two back to the main data table on Recno() to Recno()-1 and Recno() to Recno()+1".


    My apologies if I am missing something simple, this is my first week using Monarch. I appreciate your help.


    Chase

    I think Steve was thinking along these lines, but he can correct me if needed.  Since you are using Data Prep Studio, once you added those new columns, duplicate that table.  This will give you 2 identical tables.  For this example, I will call the original table "Master" and the duplicated one "Lookup"  You can then use the "Transform Data" tab at the top to join both of these tables together.  Put the "master" table on the left and the "lookup" table on the right.  Assuming you want to keep all the rows from "Master" and only pull in the rows where there is a match, I would use a left outer join.  Select the right key to be your current row (or whatever you named the formula field recno()) and the right key is the formula field that contains the recno()-1 field.  Under the advanced options for the lookup table, deselect all rows and then chose the field that you want to pull in.

    You are then going to combine the new join table (on the left) with the lookup table (on the right).  Repeat the above steps but instead of matching current row to previous row, you are going to match current row to next row.  Your join structure should look similar to this when you are done:

    You can then hide those formula field columns used for the lookup (recno cols).  Hope this helps.

  • Altair Forum User
    Altair Forum User
    Altair Employee
    edited April 2018

    I think Steve was thinking along these lines, but he can correct me if needed.  Since you are using Data Prep Studio, once you added those new columns, duplicate that table.  This will give you 2 identical tables.  For this example, I will call the original table "Master" and the duplicated one "Lookup"  You can then use the "Transform Data" tab at the top to join both of these tables together.  Put the "master" table on the left and the "lookup" table on the right.  Assuming you want to keep all the rows from "Master" and only pull in the rows where there is a match, I would use a left outer join.  Select the right key to be your current row (or whatever you named the formula field recno()) and the right key is the formula field that contains the recno()-1 field.  Under the advanced options for the lookup table, deselect all rows and then chose the field that you want to pull in.

    You are then going to combine the new join table (on the left) with the lookup table (on the right).  Repeat the above steps but instead of matching current row to previous row, you are going to match current row to next row.  Your join structure should look similar to this when you are done:

    You can then hide those formula field columns used for the lookup (recno cols).  Hope this helps.

    Yep - Thanks Chris; that's exactly what I was thinking.

  • Altair Forum User
    Altair Forum User
    Altair Employee
    edited April 2018

    I think Steve was thinking along these lines, but he can correct me if needed.  Since you are using Data Prep Studio, once you added those new columns, duplicate that table.  This will give you 2 identical tables.  For this example, I will call the original table "Master" and the duplicated one "Lookup"  You can then use the "Transform Data" tab at the top to join both of these tables together.  Put the "master" table on the left and the "lookup" table on the right.  Assuming you want to keep all the rows from "Master" and only pull in the rows where there is a match, I would use a left outer join.  Select the right key to be your current row (or whatever you named the formula field recno()) and the right key is the formula field that contains the recno()-1 field.  Under the advanced options for the lookup table, deselect all rows and then chose the field that you want to pull in.

    You are then going to combine the new join table (on the left) with the lookup table (on the right).  Repeat the above steps but instead of matching current row to previous row, you are going to match current row to next row.  Your join structure should look similar to this when you are done:

    You can then hide those formula field columns used for the lookup (recno cols).  Hope this helps.

    stevecaiels and Chris,

     

    Thank you for your help, gentleman. This solution is exactly what I needed.

     

    Chase