Help!! Trying to extract multiline report but not having success
The list below is the section of the report I am trying to extract.
I can get the names and ID to extract fine, but the names without phone numbers are throwing everything off.
1-800-CONFERENCE(R) 801229 |
8002663373 |
PO BOX 5075 N |
SAGINAW MI 48605-5075 USA |
20/20 COMMERCIAL CARE INC 801434 |
9164842020 |
4214 ROSEVILLE RD N |
NORTH HIGHLANDS CA 95660-0000 USA |
3D SYSTEMS INC 83301 |
(800) 889-2964 |
P O BOX 534963 |
ATLANTA GA 30353-4963 USA |
3GEN LLC 63715 |
949-481-6384 |
31521 RANCHO VIEJO ROAD STE 104 |
SAN JOAN CAPISTRANO CA 92675 USA |
3M 800342 |
8003275380 |
SDS6252 N |
PO BOX 269-F N |
ST LOUIS MO 63150-0269 USA |
4 IMPRINT 10 |
877-446-7746 |
101 COMMERCE ST |
PO BOX 320 |
OSHKOSH WI 54901 USA |
6 WEST DESIGN 46526 |
6 W PINE ST N |
LODI CA 94540-0000 USA |
A & L PRODUCTS INC 14 |
(800) 955-8368 |
1900 KINSER RD |
P O BOX 1018 |
CERES CA 95307-1018 USA |
A A PORTABLES INC 12 |
201 ROSCOE RD N |
MODESTO CA 95357-1828 USA |
As you can see there are addresses with 3 and 4 lines and some with and without phone numbers. The file is in cvs format.
Any suggestions.
Answers
-
A cvs file?
Is that CSV? ( I assume it is but you never know with acronyms).
I would have expected a single line per record in that case so this is a slightly strange presentation of the data.
Name and address files are usually a mess and one of the problems here, in addition to missing phone numbers, is poorly formatted numbers.
Two thoughts come to mind based in what you have shown in the post.
Firstly if there is some way to find a method to convert the records to "normal" csv format the problem may be reduced. A CSV format file can be read as a database in Monarch and if it has the correct number of commas (or alternative separator character) things should fall into place. Check the Monarch Utility for some file preparation tools.
If that does not work out another option might be ascertain whether EVERY address includes "USA" at the end.
If it does there might be an option to invert the records, making what is current the last line the first line and so providing a consistent trap option no matter how many lines the address contains. In theory I would expect some clever use of the Address Block feature to then allow you to prepare the addresses for use adding a couple of functions to sort out the phone numbers and other possible anomalies. There is probably some way of doing a full external lookup and cleaning exercise if you have the need. That might depend on how accurate and consistent the file's information proves to be.
HTH.
Grant
0 -
Altair Forum User said:
A cvs file?
Is that CSV? ( I assume it is but you never know with acronyms).
I would have expected a single line per record in that case so this is a slightly strange presentation of the data.
Name and address files are usually a mess and one of the problems here, in addition to missing phone numbers, is poorly formatted numbers.
Two thoughts come to mind based in what you have shown in the post.
Firstly if there is some way to find a method to convert the records to "normal" csv format the problem may be reduced. A CSV format file can be read as a database in Monarch and if it has the correct number of commas (or alternative separator character) things should fall into place. Check the Monarch Utility for some file preparation tools.
If that does not work out another option might be ascertain whether EVERY address includes "USA" at the end.
If it does there might be an option to invert the records, making what is current the last line the first line and so providing a consistent trap option no matter how many lines the address contains. In theory I would expect some clever use of the Address Block feature to then allow you to prepare the addresses for use adding a couple of functions to sort out the phone numbers and other possible anomalies. There is probably some way of doing a full external lookup and cleaning exercise if you have the need. That might depend on how accurate and consistent the file's information proves to be.
HTH.
Grant
Thank you Grant.
Yes it is a csv file. It is not a normal csv. I think it was converted from a proprietary file format we use at work.
As far as I can tell USA does end every address, but like you said the odd number formatting and spacing are causing me issues.
Thanks for the suggestions
Aaron Mandujan
Business Analyst II
San Joaquin General Hospital – General Accounting
Phone 209-468-6050
Email: amandujan@sjgh.org<mailto:amandujan@sjgh.org>
0 -
Altair Forum User said:
Thank you Grant.
Yes it is a csv file. It is not a normal csv. I think it was converted from a proprietary file format we use at work.
As far as I can tell USA does end every address, but like you said the odd number formatting and spacing are causing me issues.
Thanks for the suggestions
Aaron Mandujan
Business Analyst II
San Joaquin General Hospital – General Accounting
Phone 209-468-6050
Email: amandujan@sjgh.org<mailto:amandujan@sjgh.org>
Aaron,
Assuming you cannot trace the source of the file back to a point where it was still a genuine csv structure ...
On the basis of that description you might want to take a look at the file in a text editor that can show you all of the command characters.
I would guess that with a little effort you could find a way convert the file into a "normal" single row per record .csv file and at that point, if achievable, things would probably get easier. Just be sure not to lose any separators along the way, though even then things might not be too bad.
Plan B would be to grab every line in the file in its entirety to a field.
To the resulting table add a calculated field that gives a number (rowno function probably).
Invert sort the entire table. Output to a new file.
Now you can use the line with USA as a first line trap for each address. A multi-line field can be used to capture the rest of the variable number of lines in each record.
What you decide to do after that depends on you objectives for the data and how many records there are along with how many of them have "problems" like missing phone numbers.
Monarch has some rather good name and address handling features where the data is reasonably clean to start with. If it is poor data to start with then there are other features and facilities that could be explored to take it through a cleanup process.
HTH.
Grant
0 -
Altair Forum User said:
Aaron,
Assuming you cannot trace the source of the file back to a point where it was still a genuine csv structure ...
On the basis of that description you might want to take a look at the file in a text editor that can show you all of the command characters.
I would guess that with a little effort you could find a way convert the file into a "normal" single row per record .csv file and at that point, if achievable, things would probably get easier. Just be sure not to lose any separators along the way, though even then things might not be too bad.
Plan B would be to grab every line in the file in its entirety to a field.
To the resulting table add a calculated field that gives a number (rowno function probably).
Invert sort the entire table. Output to a new file.
Now you can use the line with USA as a first line trap for each address. A multi-line field can be used to capture the rest of the variable number of lines in each record.
What you decide to do after that depends on you objectives for the data and how many records there are along with how many of them have "problems" like missing phone numbers.
Monarch has some rather good name and address handling features where the data is reasonably clean to start with. If it is poor data to start with then there are other features and facilities that could be explored to take it through a cleanup process.
HTH.
Grant
Again thank you Grant for the Help.
I ended up using a model that eliminated all the rows that I wouldn’t need then reloaded it back into Monarch and through come creative appends got the file to extract what I needed with only minimal work to fix the few problem rows that remained. Your earlier suggestion led me down the right path.
Thank you again for the help.
Aaron Mandujan
0