Monarch v15.3 - Pipe Delimited File

Todd Klemp
Todd Klemp Altair Community Member
edited January 2022 in Community Q&A

I have a file that is pipe delimited with a header row. The header row indicates that there should be 5 fields of data. However, there are instances where the data in the 3rd field contains a pipe. Therefore, data that should stay in the 3rd field is pushed to fields 4 and 5. Meanwhile the data that is meant for fields 4 and 5 are truncated as there is not an additional field for the data to be pushed to. Here is an example:

Header Row:

Account_number|Comment_code|Description|Post_date|Amount

Sample Data:

1234567890|General|This is an example|1456|Address Line 1|06/07/2021|0

AAAAAA|Insurance|Contact Us|FAQs|Logout My Home Eligibility Claims Resources Switch Provider...|01/08/2021|0

What I would like to do is basically use the pipe delimiter identify each field to determine how many fields need to be created. I haven't been able to figure out how to do that just yet and not sure if this can be done. Any feedback is appreciated. Thanks!

Tagged:

Answers

  • CPorthouse
    CPorthouse
    Altair Employee
    edited January 2022

    Normally in situations like this, the field in question should be surrounded by a text qualifier (double or single quotes, for example) so any delimited file parser would understand that the delimiter is part of the field and should not be considered a delimiter.

    A couple of possibilities to consider:

    1. Ask the file provider to surround the field(s) with a text qualifier
    • 1234567890|General|"This is an example|1456"|Address Line 1|06/07/2021|0
    • "1234567890"|"General"|"This is an example|1456"|"Address Line 1"|"06/07/2021|0"
  • If #1 is not possible, then import the file as a text file instead of a database and then use Monarch (or Data Prep) to extract the various fields using lsplit and rsplit functions.
    • rsplit(lsplit([RAWLINEDATA],3,"|",3),4,"|",4)
    • If I did the formula above correctly it will split the RAWLINEDATA field into 3 parts based on the pipe delimiter and only return the 3rd part: This is an example|1456|Address Line 1|06/07/2021|0
    • and then get passed to the rsplit function to split the rest into 4 parts and return the 4th part: This is an example|1456