"On .CSV or .xls import all values changed to 0"
jcachat
New Altair Community Member
I am trying to import time series data with thousands of rows. In particular, there is a "Time" column, "X" and "Y" coordinate columns, and various attributes such as "Velocity".
Every time I import, the values in the "X" and "Y" columns, and only those columns, are changed to 0 in every row. Every other column is fine.
For example, the import file reads:
Trial time Time X center Y center Velocity Area
1.133 0.099 0.06227948 -0.055351435 0.031315749 2.14E-06
1.166 0.133 0.062465161 -0.056225227 0.02679856 4.75E-06
1.199 0.166 0.062480482 -0.057009747 0.023540315 4.75E-06
1.233 0.199 0.062326724 -0.057716685 0.021704211 9.74E-06
1.266 0.233 0.062005108 -0.05835374 0.021408643 1.02E-05
But when I import it via "Import CSV" (or "Import Excel Sheet"), my results are:
Trial time Time X center Y center Velocity
1.133 0.099 0 0 0.031315749
1.166 0.133 0 0 0.02679856
1.199 0.166 0 0 0.023540315
1.233 0.199 0 0 0.021704211
1.266 0.233 0 0 0.021408643
After multiple attempts with .CSV and .XLS files, I cannot get the data imported correctly and any insight is much appreciated. I want to use RapidMiner 5.0, but as it stands only 4.6 will import the data correctly.
I should also note that if I do NOT select "Use First Row as Column Names", the values come back.
JC
Answers
Hi there,
If I save your data, without the column titles, into a CSV and then pass the loaded data through the Guess Types operator, all is well. What puzzles me is that the CSV reader treats the affected columns as integers, whereas Guess Types identifies them as reals. Anyway, here is what worked for me:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.0">
  <context>
    <input>
      <location/>
    </input>
    <output>
      <location/>
      <location/>
    </output>
    <macros/>
  </context>
  <operator activated="true" class="process" expanded="true" name="Process">
    <process expanded="true" height="-20" width="-50">
      <operator activated="true" class="read_csv" expanded="true" height="60" name="Read CSV" width="90" x="18" y="90">
        <parameter key="file_name" value="C:\Documents and Settings\Alien\My Documents\rm_workspace\R5 Forum\fdata.csv"/>
        <parameter key="use_first_row_as_attribute_names" value="false"/>
      </operator>
      <operator activated="true" breakpoints="before" class="guess_types" expanded="true" height="76" name="Guess Types" width="90" x="199" y="87"/>
      <connect from_op="Read CSV" from_port="output" to_op="Guess Types" to_port="example set input"/>
      <connect from_op="Guess Types" from_port="example set output" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>
</process>
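
As a rough illustration of why an integer type on the X and Y columns would produce exactly the zeros reported in the question, here is a minimal Python sketch. It is not RapidMiner code and does not claim to reproduce what the CSV reader does internally. It only assumes that a column typed as integer has its fractional part dropped. The comma-separated sample, the `coerce` and `load` helpers, and the `integer_columns` parameter are all made up for this example; the numbers are the first two rows from the question.

```python
import csv
import io

# First two rows from the question, written as comma-separated text
# (the delimiter is an assumption made for this sketch).
SAMPLE = """Trial time,Time,X center,Y center,Velocity,Area
1.133,0.099,0.06227948,-0.055351435,0.031315749,2.14E-06
1.166,0.133,0.062465161,-0.056225227,0.02679856,4.75E-06
"""


def coerce(value, as_integer):
    """Parse a cell as a real number, truncating it if the column is typed integer."""
    number = float(value)
    return int(number) if as_integer else number


def load(text, integer_columns):
    """Read CSV text, treating the named columns as integer and the rest as real."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        {name: coerce(value, name in integer_columns) for name, value in row.items()}
        for row in reader
    ]


# Hypothetical mis-typing: only the coordinate columns are treated as integers.
rows = load(SAMPLE, integer_columns={"X center", "Y center"})
print(rows[0]["X center"], rows[0]["Y center"], rows[0]["Velocity"])
# prints: 0 0 0.031315749
```

Every X and Y value in the sample lies strictly between -1 and 1, so truncation turns each of them into 0, while columns kept as reals are unchanged. That matches the symptom and is consistent with Guess Types fixing it by re-typing those columns as real.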