CSV reading bug: columns and values are mismatched in reader's output [FALSE BUG]
jeanluc
New Altair Community Member
Hello,
I've observed the following bug with build 5.0.001. I have a very simple CSV file (pasted below; it contains some speed measurements from a mobile device). Compare the file with what is shown in the output. Look at the 3rd column in the file, "Download". It contains numerical values. Now look at the image showing the CSV reader's output. You will see that Download appears as Nominal and its range of values is shifted to the next column, "Upload". The range of values for the Upload column is listed under the next column (Latency), and so on.
The bug is reproducible every time. As a comparison, the same file converted to XLS and read with the Excel reader was parsed correctly.
A bug fix would be appreciated, as I'll have to read CSV files that are too large to be read as XLS.
Thank you!
"Date","Location","Download","Upload","Latency"
05/02/2010 21:39:00,"Date",4070,351,166
05/02/2010 21:38:00,"home",3793,352,164
05/02/2010 21:38:00,"home",4447,350,169
05/02/2010 21:38:00,"home",3595,350,159
05/02/2010 21:37:00,"home",3077,327,1770
05/02/2010 21:37:00,"home",2230,309,259
05/02/2010 11:52:00,"downtown",76,117,219
05/02/2010 11:52:00,"downtown",163,68,205
05/02/2010 11:51:00,"downtown",723,231,186
05/02/2010 11:51:00,"downtown",377,0,270
04/02/2010 21:50:00,"home",2632,327,165
04/02/2010 21:49:00,"home",2803,328,188
04/02/2010 21:49:00,"home",1586,329,276
04/02/2010 21:48:00,"home",2765,357,218
04/02/2010 21:48:00,"home",1634,198,335
04/02/2010 11:43:00,"downtown",692,255,235
04/02/2010 11:43:00,"downtown",602,113,2717
04/02/2010 11:42:00,"downtown",775,56,239
04/02/2010 11:42:00,"downtown",779,312,8148
04/02/2010 11:41:00,"downtown",225,43,221
04/02/2010 11:41:00,"downtown",471,286,3328
03/02/2010 21:50:00,"home",1239,276,4229
03/02/2010 21:49:00,"home",1339,272,2262
03/02/2010 21:48:00,"home",1600,313,197
03/02/2010 21:47:00,"home",2135,313,187
03/02/2010 21:47:00,"home",2026,269,271
03/02/2010 11:50:00,"downtown",711,266,210
03/02/2010 11:50:00,"downtown",152,315,2638
03/02/2010 11:49:00,"downtown",24,249,301
03/02/2010 11:47:00,"downtown",561,291,1740
03/02/2010 11:47:00,"downtown",863,115,213
02/02/2010 21:54:00,"home",1540,351,200
02/02/2010 21:54:00,"home",1493,285,205
02/02/2010 21:53:00,"home",1606,319,194
02/02/2010 21:53:00,"home",1823,319,174
02/02/2010 21:53:00,"home",2150,250,254
02/02/2010 12:07:00,"downtown",472,273,2266
02/02/2010 12:07:00,"downtown",387,267,2736
02/02/2010 12:06:00,"downtown",381,249,280
02/02/2010 12:04:00,"downtown",312,195,3775
02/02/2010 12:03:00,"downtown",863,260,281
02/02/2010 12:02:00,"downtown",405,111,217
01/02/2010 21:36:00,"home",3326,354,183
01/02/2010 21:36:00,"home",3119,326,172
01/02/2010 21:35:00,"home",3677,330,160
01/02/2010 21:35:00,"home",3151,355,182
01/02/2010 21:35:00,"home",3152,314,282
01/02/2010 11:58:00,"downtown",1244,316,1716
01/02/2010 11:58:00,"downtown",1284,312,192
01/02/2010 11:58:00,"downtown",1211,319,206
01/02/2010 11:57:00,"downtown",900,310,208
01/02/2010 11:57:00,"downtown",683,278,5488
Answers
Hi,
Am I imagining things, or is there a correlation between how new someone is to Rapidminer and how likely they are to declare a bug, falsely? Jeanluc, if you leave spaces as allowable separators what else do you expect? If you leave just a comma in the column separator parameter slot you will find your bug is fixed!
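To see why leaving the space in the separator list shifts everything, here is a minimal Python sketch. It is not RapidMiner's parser; `split_line` is a hypothetical helper that just splits on any of the given separator characters and strips the quotes. The sample rows are taken from the file in the question.

```python
# Hypothetical illustration only: a naive splitter, not RapidMiner's CSV reader.
HEADER = '"Date","Location","Download","Upload","Latency"'
ROW = '05/02/2010 21:38:00,"home",3793,352,164'


def split_line(line, separators):
    """Split `line` on every character found in `separators`, then strip quotes."""
    fields = []
    current = ""
    for ch in line:
        if ch in separators:
            fields.append(current)
            current = ""
        else:
            current += ch
    fields.append(current)
    return [field.strip('"') for field in fields]


# Comma AND space as separators: the timestamp breaks into two fields.
header_both = split_line(HEADER, ", ")
row_both = split_line(ROW, ", ")

# Comma only: one field per column.
header_comma = split_line(HEADER, ",")
row_comma = split_line(ROW, ",")

print(len(header_both), len(row_both))    # header and row field counts differ
print(header_both[2], "->", row_both[2])  # text lands under "Download"
print(len(header_comma), len(row_comma))  # counts match again
print(header_comma[2], "->", row_comma[2])
```

With both separators, the date and the time become separate fields, so the row has one field more than the header and "home" ends up under Download, which is why that column looks Nominal and the numbers slide one column to the right. With the comma alone, the row lines up with the header again.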
haddock wrote:
"Hi,
Am I imagining things, or is there a correlation between how new someone is to Rapidminer and how likely they are to declare a bug, falsely?"
Of course, there is, beginners are by definition more likely to make mistakes :-)
haddock wrote:
"Jeanluc, if you leave spaces as allowable separators what else do you expect? If you leave just a comma in the column separator parameter slot you will find your bug is fixed!"
Ah, you're correct. Thank you!