Skipping comments in CSV loop import
tatianiia
New Altair Community Member
Hi! I experience the following problem:
I am using "Loop Files" and "Read CSV" to import a set of .csv files to RapidMiner.
Each file has roughly the following format:
none1,none2,none3
var1,var2,var3,var4
1,2,3,4
1,2,3,4
I want to skip the first line, so I annotate it as a comment and the second line as the name row.
However, although the first line does not appear in the output, the number of columns still seems to be determined by it: the output has 3 columns instead of the expected 4.
I read that one workaround is to name every required column in the data set meta data information, but here I run into another obstacle: my .csv files have different numbers of attributes, and I don't see any way to create a single list of columns in the meta data information that fits them all.
Are there any solutions for that? I have not found any, so I will be grateful for your help.
Answers
-
I have removed the first line with the "Remove Document Parts" operator, so everything works fine now. But I would still be glad to hear some useful comments, since this solution does not seem ideal.
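For readers who want to see the idea in isolation, here is a minimal sketch in plain Python (not a RapidMiner process) of dropping the first line before the CSV is parsed, so that the column count comes from the real header row. The function name `read_csv_skipping_first_line` and the sample text are hypothetical and only mirror the file format described in the question.

```python
import csv


def read_csv_skipping_first_line(text):
    """Parse CSV text after discarding its first line.

    Hypothetical helper for illustration: the first line is treated as a
    comment, the second line as the header, and the rest as data rows.
    """
    lines = text.splitlines()[1:]   # drop the comment line before parsing
    reader = csv.reader(lines)
    header = next(reader, [])       # header row; empty list if nothing is left
    rows = list(reader)
    return header, rows


# Sample text in the format from the question (assumed, not a real file).
sample = "none1,none2,none3\nvar1,var2,var3,var4\n1,2,3,4\n1,2,3,4\n"
header, rows = read_csv_skipping_first_line(sample)
print(header)
print(rows)
```

Because the comment line is removed before the parser sees it, the header width is taken from each file's own second line, so files with different numbers of attributes need no shared column list.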
-
Hi,
unfortunately that is a bug. It is on our list and will be fixed in the future.
Regards,
Marco