Import Data

Ellie98
Ellie98 New Altair Community Member
edited November 5 in Community Q&A
Hello, 
I'm having difficulty importing the data from my teacher so that I can work with it. Could someone help me out, please? Do I have to import it with a read operator first?
I have to import and work with the Iris data, but it's not working.


Thanks in advance!

Answers

  • kayman
    kayman New Altair Community Member
    Hi @Ellie98, you should be able to use dat files if you import them with the Read CSV operator and set the column separator to tab (\t).
    Don't use the wizard, just do a plain, simple import with the above settings. This should result in a structured format to work with in RapidMiner.
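
    As a rough sketch of those settings, the relevant part of a Read CSV operator in the process XML could look like the fragment below. The path is a placeholder, and the other parameters are left out on the assumption that RapidMiner falls back to its defaults for anything omitted. It is a fragment only, not a complete process.

    ```xml
    <operator activated="true" class="read_csv" compatibility="9.8.001" expanded="true" name="Read CSV">
      <!-- placeholder: replace with the full path to your dat file -->
      <parameter key="csv_file" value="path_to_your_dat"/>
      <!-- split columns on tab characters -->
      <parameter key="column_separators" value="\t"/>
      <!-- treat quote characters as ordinary text -->
      <parameter key="use_quotes" value="false"/>
      <!-- assumes the file has no header row; set to true if it does -->
      <parameter key="first_row_as_names" value="false"/>
    </operator>
    ```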
  • Ellie98
    Ellie98 New Altair Community Member
    Hey @kayman,
    if I try to open the file with Read CSV, I can't see the file anymore.
  • kayman
    kayman New Altair Community Member
    Answer ✓
    Hi @Ellie98, that's probably normal, as we're 'faking' a csv file here.

    It works for my dat files, but of course that doesn't mean it works for everything. Given that a dat file is in essence just a tab-separated file, it should give you at least something.
    Just try the process below, but change the path to your dat file:

    <?xml version="1.0" encoding="UTF-8"?><process version="9.8.001">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="9.8.001" expanded="true" name="Process">
        <parameter key="logverbosity" value="init"/>
        <parameter key="random_seed" value="2001"/>
        <parameter key="send_mail" value="never"/>
        <parameter key="notification_email" value=""/>
        <parameter key="process_duration_for_mail" value="30"/>
        <parameter key="encoding" value="SYSTEM"/>
        <process expanded="true">
          <operator activated="true" class="read_csv" compatibility="9.8.001" expanded="true" height="68" name="Read CSV" width="90" x="313" y="34">
            <parameter key="csv_file" value="path_to_your_dat"/>
            <parameter key="column_separators" value="\t"/>
            <parameter key="trim_lines" value="false"/>
            <parameter key="use_quotes" value="false"/>
            <parameter key="quotes_character" value="&quot;"/>
            <parameter key="escape_character" value="\"/>
            <parameter key="skip_comments" value="false"/>
            <parameter key="comment_characters" value="#"/>
            <parameter key="starting_row" value="1"/>
            <parameter key="parse_numbers" value="true"/>
            <parameter key="decimal_character" value="."/>
            <parameter key="grouped_digits" value="false"/>
            <parameter key="grouping_character" value=","/>
            <parameter key="infinity_representation" value=""/>
            <parameter key="date_format" value=""/>
            <parameter key="first_row_as_names" value="false"/>
            <list key="annotations"/>
            <parameter key="time_zone" value="ECT"/>
            <parameter key="locale" value="English (United States)"/>
            <parameter key="encoding" value="UTF-8"/>
            <parameter key="read_all_values_as_polynominal" value="true"/>
            <list key="data_set_meta_data_information"/>
            <parameter key="read_not_matching_values_as_missings" value="true"/>
            <parameter key="datamanagement" value="double_array"/>
            <parameter key="data_management" value="auto"/>
          </operator>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
        </process>
      </operator>
    </process>




  • Ellie98
    Ellie98 New Altair Community Member
    @kayman I'm not sure what to do. Should I copy and paste the code somewhere? If so, where?
  • kayman
    kayman New Altair Community Member
    @Ellie98
    Open a new process and select View -> Show Panel -> XML from the menu.

    This shows the raw code behind the scenes. Remove everything you see (there shouldn't be much, as it's a new process window).
    Paste the code above and click the green check mark in the upper left corner of the XML window; this 'saves' the code as the process.
    Move back to your process window, and everything should now be visible in the friendly format again.

  • Ellie98
    Ellie98 New Altair Community Member
    @kayman
    Okay, I got it. It works now, thank you so much! :smiley: