Import ".names" and ".data" files into RapidMiner
pariB
New Altair Community Member
Hi everybody,
I have a data set with some nominal attributes that I need to convert to numerical ones. The data set is from the UCI Machine Learning Repository (http://archive.ics.uci.edu/ml/). It comes as two types of files (a ".data" file and a ".names" file).
1) How should I import the ".data" and ".names" files into RapidMiner?
2) How should the "Nominal to Numerical" operator be used to convert the data?
Any help will be appreciated.
Answers
Hi,
The .data file is simply a CSV-like, comma-separated text file.
So just download the file (you can rename it to something.csv) and read it with the Read CSV operator.
I don't think you need to import the .names file; it only contains meta information about the data file.
To make it even easier for you: copy and paste the code below into RapidMiner's XML view (replacing the existing XML), click the green check symbol, and switch back to the normal Process (diagram) view. Adjust the csv_file path so that it points to your downloaded file.
If you click the import wizard on the Read CSV operator, you can specify the names and types of your attributes as you please.
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.2.008">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="5.2.008" expanded="true" name="Process">
    <process expanded="true" height="145" width="279">
      <operator activated="true" class="read_csv" compatibility="5.2.008" expanded="true" height="60" name="Read CSV" width="90" x="179" y="75">
        <parameter key="csv_file" value="C:\iris.data"/>
        <parameter key="column_separators" value=","/>
        <parameter key="first_row_as_names" value="false"/>
        <list key="annotations"/>
        <parameter key="encoding" value="windows-1252"/>
        <list key="data_set_meta_data_information">
          <parameter key="0" value="att1.true.real.attribute"/>
          <parameter key="1" value="att2.true.real.attribute"/>
          <parameter key="2" value="att3.true.real.attribute"/>
          <parameter key="3" value="att4.true.real.attribute"/>
          <parameter key="4" value="att5.true.polynominal.attribute"/>
        </list>
      </operator>
      <connect from_op="Read CSV" from_port="output" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>
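For the second question, a rough sketch of how a Nominal to Numerical operator might be chained after Read CSV is given here. It reuses the Read CSV settings from the process above. The operator class name (nominal_to_numerical), its parameter keys (attribute_filter_type, attribute, coding_type), the "unique integers" value and the port names are my assumptions about how RapidMiner 5.2 serialises this operator, so check them against what your own installation writes to the XML view after you drag the operator into the process. The choice of att5 is only for illustration.

```xml
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.2.008">
  <operator activated="true" class="process" compatibility="5.2.008" expanded="true" name="Process">
    <process expanded="true">
      <!-- Same Read CSV settings as in the process above; the path is a placeholder -->
      <operator activated="true" class="read_csv" compatibility="5.2.008" expanded="true" name="Read CSV">
        <parameter key="csv_file" value="C:\iris.data"/>
        <parameter key="column_separators" value=","/>
        <parameter key="first_row_as_names" value="false"/>
        <list key="annotations"/>
        <list key="data_set_meta_data_information">
          <parameter key="0" value="att1.true.real.attribute"/>
          <parameter key="1" value="att2.true.real.attribute"/>
          <parameter key="2" value="att3.true.real.attribute"/>
          <parameter key="3" value="att4.true.real.attribute"/>
          <parameter key="4" value="att5.true.polynominal.attribute"/>
        </list>
      </operator>
      <!-- Assumed class name and parameter keys: converts only the nominal column att5 -->
      <operator activated="true" class="nominal_to_numerical" compatibility="5.2.008" expanded="true" name="Nominal to Numerical">
        <parameter key="attribute_filter_type" value="single"/>
        <parameter key="attribute" value="att5"/>
        <!-- Assumed value: each nominal value is replaced by one integer code -->
        <parameter key="coding_type" value="unique integers"/>
      </operator>
      <!-- Assumed port names for the conversion operator -->
      <connect from_op="Read CSV" from_port="output" to_op="Nominal to Numerical" to_port="example set input"/>
      <connect from_op="Nominal to Numerical" from_port="example set output" to_port="result 1"/>
    </process>
  </operator>
</process>
```

If the operator in your version offers a dummy coding option instead, each nominal value becomes its own 0/1 attribute rather than a single integer column, which is usually the safer choice when the values have no natural order.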