Help!! Trying to extract multiline report but not having success

Aaron_21459
Aaron_21459 Altair Community Member
edited January 2017 in Community Q&A

The list below is the section of the report I am trying to extract.

 

I can get the names and ID to extract fine, but the names without phone numbers are throwing everything off.

 

 

     1-800-CONFERENCE(R) 801229
8002663373
PO BOX 5075 N
SAGINAW MI 48605-5075      USA
20/20 COMMERCIAL CARE INC 801434
9164842020
4214 ROSEVILLE RD N
NORTH HIGHLANDS CA 95660-0000      USA
3D SYSTEMS INC 83301
(800) 889-2964
P O BOX 534963
ATLANTA GA 30353-4963      USA
3GEN LLC 63715
949-481-6384
31521 RANCHO VIEJO ROAD STE 104
SAN JOAN CAPISTRANO CA 92675           USA
3M 800342
8003275380
SDS6252 N
PO BOX 269-F N
ST LOUIS MO 63150-0269      USA
4 IMPRINT 10
877-446-7746
101 COMMERCE ST
PO BOX 320
OSHKOSH WI 54901           USA
6 WEST DESIGN 46526
6 W PINE ST N
LODI CA 94540-0000      USA
A & L PRODUCTS INC 14
(800) 955-8368
1900 KINSER RD
P O BOX 1018
CERES CA 95307-1018      USA
A A PORTABLES INC 12
201 ROSCOE RD N
MODESTO CA 95357-1828      USA

 

 

As you can see there are addresses with 3 and 4 lines and some with and without phone numbers. The file is in cvs format.

 

Any suggestions.

Tagged:

Answers

  • Altair Forum User
    Altair Forum User
    Altair Employee
    edited January 2017

    A cvs file?

     

    Is that CSV? ( I assume it is but you never know with acronyms).

     

    I would have expected a single line per record in that case so this is a slightly strange presentation of the data.

     

    Name and address files are usually a mess and one of the problems here, in addition to missing phone numbers, is poorly formatted numbers.

     

    Two thoughts come to mind based in what you have shown in the post.

     

    Firstly if there is some way to find a method to convert the records to "normal" csv format the problem may be reduced. A CSV format file can be read as a database in Monarch and if it has the correct number of commas (or alternative separator character) things should fall into place. Check the Monarch Utility for some file preparation tools.

     

    If that does not work out another option might be ascertain whether EVERY address includes "USA" at the end.

     

    If it does there might be an option to invert the records, making what is current the last line the first line and so providing a consistent trap option no matter how many lines the address contains. In theory I would expect some clever use of the Address Block feature to then allow you to prepare the addresses for use adding a couple of functions to sort out the phone numbers and other possible anomalies. There is probably some way of doing a full external lookup and cleaning exercise if you have the need. That might depend on how accurate and consistent the file's information proves to be.

     

    HTH.

     

    Grant

  • Altair Forum User
    Altair Forum User
    Altair Employee
    edited January 2017

    A cvs file?

     

    Is that CSV? ( I assume it is but you never know with acronyms).

     

    I would have expected a single line per record in that case so this is a slightly strange presentation of the data.

     

    Name and address files are usually a mess and one of the problems here, in addition to missing phone numbers, is poorly formatted numbers.

     

    Two thoughts come to mind based in what you have shown in the post.

     

    Firstly if there is some way to find a method to convert the records to "normal" csv format the problem may be reduced. A CSV format file can be read as a database in Monarch and if it has the correct number of commas (or alternative separator character) things should fall into place. Check the Monarch Utility for some file preparation tools.

     

    If that does not work out another option might be ascertain whether EVERY address includes "USA" at the end.

     

    If it does there might be an option to invert the records, making what is current the last line the first line and so providing a consistent trap option no matter how many lines the address contains. In theory I would expect some clever use of the Address Block feature to then allow you to prepare the addresses for use adding a couple of functions to sort out the phone numbers and other possible anomalies. There is probably some way of doing a full external lookup and cleaning exercise if you have the need. That might depend on how accurate and consistent the file's information proves to be.

     

    HTH.

     

    Grant

    Thank you Grant.

     

    Yes it is a csv file. It is not a normal csv. I think it was converted from a proprietary file format we use at work.

     

    As far as I can tell USA does end every address, but like you said the odd number formatting and spacing are causing me issues.

     

    Thanks for the suggestions

     

    Aaron Mandujan

    Business Analyst II

    San Joaquin General Hospital – General Accounting

    Phone 209-468-6050

    Email: amandujan@sjgh.org<mailto:amandujan@sjgh.org>

  • Altair Forum User
    Altair Forum User
    Altair Employee
    edited January 2017

    Thank you Grant.

     

    Yes it is a csv file. It is not a normal csv. I think it was converted from a proprietary file format we use at work.

     

    As far as I can tell USA does end every address, but like you said the odd number formatting and spacing are causing me issues.

     

    Thanks for the suggestions

     

    Aaron Mandujan

    Business Analyst II

    San Joaquin General Hospital – General Accounting

    Phone 209-468-6050

    Email: amandujan@sjgh.org<mailto:amandujan@sjgh.org>

    Aaron,

     

    Assuming you cannot trace the source of the file back to a point where it was still a genuine csv structure ...

     

    On the basis of that description you might want to take a look at the file in a text editor that can show you all of the command characters.

     

    I would guess that with a little effort you could find a way convert the file into a "normal" single row per record .csv file and at that point, if achievable, things would probably get easier. Just be sure not to lose any separators along the way, though even then things might not be too bad.

     

    Plan B would be to grab every line in the file in its entirety to a field.

     

    To the resulting table add a calculated field that gives a number (rowno function probably).

     

    Invert sort the entire table. Output to a new file.

     

    Now you can use the line with USA as a first line trap for each address. A multi-line field can be used to capture the rest of the variable number of lines in each record.

     

    What you decide to do after that depends on you objectives for the data and how many records there are along with how many of them have "problems" like missing phone numbers.

     

    Monarch has some rather good name and address handling features where the data is reasonably clean to start with. If it is poor data to start with then there are other features and facilities that could be explored to take it through a cleanup process.

     

    HTH.

     

     

    Grant

  • Altair Forum User
    Altair Forum User
    Altair Employee
    edited January 2017

    Aaron,

     

    Assuming you cannot trace the source of the file back to a point where it was still a genuine csv structure ...

     

    On the basis of that description you might want to take a look at the file in a text editor that can show you all of the command characters.

     

    I would guess that with a little effort you could find a way convert the file into a "normal" single row per record .csv file and at that point, if achievable, things would probably get easier. Just be sure not to lose any separators along the way, though even then things might not be too bad.

     

    Plan B would be to grab every line in the file in its entirety to a field.

     

    To the resulting table add a calculated field that gives a number (rowno function probably).

     

    Invert sort the entire table. Output to a new file.

     

    Now you can use the line with USA as a first line trap for each address. A multi-line field can be used to capture the rest of the variable number of lines in each record.

     

    What you decide to do after that depends on you objectives for the data and how many records there are along with how many of them have "problems" like missing phone numbers.

     

    Monarch has some rather good name and address handling features where the data is reasonably clean to start with. If it is poor data to start with then there are other features and facilities that could be explored to take it through a cleanup process.

     

    HTH.

     

     

    Grant

    Again thank you Grant for the Help.

     

    I ended up using a model that eliminated all the rows that I wouldn’t need then reloaded it back into Monarch and through come creative appends got the file to extract what I needed with only minimal work to fix the few problem rows that remained. Your earlier suggestion led me down the right path.

     

    Thank you again for the help.

     

    Aaron Mandujan