Import ".names" and ".data" files into RapidMiner

pariB New Altair Community Member
edited November 5 in Community Q&A

Hi everybody,

I have a data set with some nominal attributes that I need to convert to numerical ones. The data set is from the UCI Machine Learning Repository (http://archive.ics.uci.edu/ml/). It comes as two kinds of files (a ".data" file and a ".names" file).
1) How should I import the ".data" and ".names" files into RapidMiner?
2) How should the "Nominal to Numerical" operator be used to convert the data?

Any help will be appreciated.

Answers

  • Hi,
    the .data file is simply in a CSV-like, comma-separated format.

    So just download the file (you can rename it to something.csv) and use the Read CSV operator.
    I don't think you need to import the .names file; it only contains meta information about the data file.

    To make it even easier for you: just copy and paste this code into the XML view in RapidMiner (replacing the previous XML code), click the green check symbol, and switch back to the normal Process (diagram) view.

    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.2.008">
     <context>
       <input/>
       <output/>
       <macros/>
     </context>
     <operator activated="true" class="process" compatibility="5.2.008" expanded="true" name="Process">
       <process expanded="true" height="145" width="279">
         <operator activated="true" class="read_csv" compatibility="5.2.008" expanded="true" height="60" name="Read CSV" width="90" x="179" y="75">
           <!-- Adjust this path to wherever you saved the downloaded .data file -->
           <parameter key="csv_file" value="C:\iris.data"/>
           <parameter key="column_separators" value=","/>
           <parameter key="first_row_as_names" value="false"/>
           <list key="annotations"/>
           <parameter key="encoding" value="windows-1252"/>
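           <!-- One entry per column: the key is the column index, the value is name.selected.value_type.role -->
           <list key="data_set_meta_data_information">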
             <parameter key="0" value="att1.true.real.attribute"/>
             <parameter key="1" value="att2.true.real.attribute"/>
             <parameter key="2" value="att3.true.real.attribute"/>
             <parameter key="3" value="att4.true.real.attribute"/>
             <parameter key="4" value="att5.true.polynominal.attribute"/>
           </list>
         </operator>
         <connect from_op="Read CSV" from_port="output" to_port="result 1"/>
         <portSpacing port="source_input 1" spacing="0"/>
         <portSpacing port="sink_result 1" spacing="0"/>
         <portSpacing port="sink_result 2" spacing="0"/>
       </process>
     </operator>
    </process>

    If you click the import wizard on the Read CSV operator, you can specify the types and names of your variables as you please.
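
    For the second question, here is a rough sketch of how the conversion step could be chained after Read CSV. It reuses the Read CSV settings from the process above and adds a Nominal to Numerical operator. The operator class key `nominal_to_numerical`, the `coding_type` value "unique integers", and the port names "example set input" and "example set output" are written from memory of RapidMiner 5.x and are assumptions, so check them against your installation (dragging the operator in from the Operators panel and looking at the XML view will show the exact names).

    ```xml
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.2.008">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="5.2.008" expanded="true" name="Process">
        <process expanded="true" height="145" width="413">
          <!-- Same Read CSV setup as in the process above; adjust the file path -->
          <operator activated="true" class="read_csv" compatibility="5.2.008" expanded="true" height="60" name="Read CSV" width="90" x="179" y="75">
            <parameter key="csv_file" value="C:\iris.data"/>
            <parameter key="column_separators" value=","/>
            <parameter key="first_row_as_names" value="false"/>
            <list key="annotations"/>
            <parameter key="encoding" value="windows-1252"/>
            <list key="data_set_meta_data_information">
              <parameter key="0" value="att1.true.real.attribute"/>
              <parameter key="1" value="att2.true.real.attribute"/>
              <parameter key="2" value="att3.true.real.attribute"/>
              <parameter key="3" value="att4.true.real.attribute"/>
              <parameter key="4" value="att5.true.polynominal.attribute"/>
            </list>
          </operator>
          <!-- Assumption: class key, coding_type value and port names are recalled from RapidMiner 5.x; verify locally -->
          <operator activated="true" class="nominal_to_numerical" compatibility="5.2.008" expanded="true" height="94" name="Nominal to Numerical" width="90" x="313" y="75">
            <!-- "unique integers" maps each nominal value to one integer code -->
            <parameter key="coding_type" value="unique integers"/>
            <list key="comparison_groups"/>
          </operator>
          <!-- Read CSV feeds the converter, and the converted example set goes to the result port -->
          <connect from_op="Read CSV" from_port="output" to_op="Nominal to Numerical" to_port="example set input"/>
          <connect from_op="Nominal to Numerical" from_port="example set output" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>
    ```

    With these settings the operator should act on all nominal regular attributes, which here is only att5. If you would rather get one 0/1 column per nominal value, the coding type can be switched to "dummy coding" in the operator's parameter panel.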

    f