XPath: multiple records from multiple files
Loky
New Altair Community Member
Hi there,
I'm stuck on something that is probably simple to solve.
I have a bunch of HTML files crawled from the web. I process them and use Cut Document so I can extract multiple records from the same file.
Basically, I need two columns from a table found in each of those files. Extracting both is no problem, but I lose the correspondence between the records: I can no longer tell which value from the first column belongs to which value from the second.
As a result I get a WordList with the records from both columns mixed together and sorted alphabetically.
Ideally I would have those records in two separate columns, with each row kept together. How can I do that?
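For readers outside RapidMiner, here is a minimal Python sketch of the general idea of keeping the correspondence: select each table row first, then read both cells relative to that row, instead of querying each column separately. The sample markup and the `extract_pairs` helper are hypothetical, and the standard-library parser used here only accepts well-formed markup, so real crawled HTML would probably need a more lenient parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample standing in for one crawled file.
SAMPLE = """
<html><body><table>
  <tr><td>alpha</td><td>1</td></tr>
  <tr><td>beta</td><td>2</td></tr>
</table></body></html>
"""


def extract_pairs(markup):
    """Return (first_cell, second_cell) tuples, one per table row."""
    root = ET.fromstring(markup.strip())
    pairs = []
    # Iterate over rows first so both cells always come from the same row.
    for row in root.findall(".//tr"):
        cells = row.findall("td")
        if len(cells) >= 2:  # skip header or malformed rows
            pairs.append((cells[0].text, cells[1].text))
    return pairs


if __name__ == "__main__":
    for first, second in extract_pairs(SAMPLE):
        print(first, second)
```

Because the pairing happens per row, nothing needs to be re-matched afterwards.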
EDIT:
It looks like if I check Keep Text on Process Documents from Files, I get the data in the example set output too. It is still in a single column, but this time there is some kind of correspondence: the first row of data is in cells 1 and 2, the second row is in cells 3 and 4, and so on. Now how can I split them?
EDIT2: It is done. First, I copied the Process Documents from Files operator so that the two columns are extracted into two separate tables. Then, using several Generate Attributes operators, Generate ID, and one Join, I managed to build a single table with my values.
After that I used concatenation to build the row I need from all the values I extracted.
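The alternating layout described here (cells 1 and 2 form the first row, cells 3 and 4 the second) can be split by taking every other value. A small Python sketch, assuming the cells are already available as a flat list; `split_alternating` is a made-up helper name, not a RapidMiner operator:

```python
def split_alternating(cells):
    """Turn [a1, b1, a2, b2, ...] into [(a1, b1), (a2, b2), ...]."""
    if len(cells) % 2 != 0:
        raise ValueError("expected an even number of cells")
    # Even positions hold the first column, odd positions the second.
    return list(zip(cells[0::2], cells[1::2]))


if __name__ == "__main__":
    flat = ["alpha", "1", "beta", "2"]  # hypothetical extracted cells
    for first, second in split_alternating(flat):
        print(first, second)
```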
Just in case someone needs this, this is the solution I found. Best of luck.
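Outside RapidMiner, the same idea (give each extracted value a running ID, join the two tables on that ID, then concatenate) could look roughly like this in Python. It is only a sketch of the approach, not the actual process; the function name, the sample values, and the separator are assumptions.

```python
def join_by_generated_id(first_column, second_column, separator=" "):
    """Join two value lists on a generated 1-based ID and concatenate each pair."""
    # Equivalent of generating an ID per table: position in the list.
    left = dict(enumerate(first_column, start=1))
    right = dict(enumerate(second_column, start=1))
    # Inner join: keep only IDs present in both tables.
    shared_ids = sorted(left.keys() & right.keys())
    return [f"{left[i]}{separator}{right[i]}" for i in shared_ids]


if __name__ == "__main__":
    names = ["alpha", "beta"]   # hypothetical first column
    values = ["1", "2"]         # hypothetical second column
    for row in join_by_generated_id(names, values, separator=";"):
        print(row)
```

This only works if both extractions return the values in the same document order, which is the assumption the generated ID relies on.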
Any hint would help.
Thank you in advance.
Answers
A solution (or workaround, I'm not sure which) was found. I posted it in my second EDIT.
Cheers.