How to format a dataset for RapidMiner

Berettar New Altair Community Member
edited November 5 in Community Q&A
Hello there,

I am completely new to RapidMiner, so here is a problem I can't solve at the moment.
I have a large dataset in an Excel file (about 300 rows and 100 columns). When I try to use this Excel file as my input file (via the "start data loading wizard"), RapidMiner doesn't identify each column properly but splits them into many separate columns. Any suggestions on how to solve this easily?

Thanks ;)

[edit1] Now each column is identified properly (using the .csv format), but between each pair of relevant columns I now get an extra column containing the string ",", and I can't get rid of it by modifying my .csv file. Any suggestions?

[edit2] OK, now it's getting strange. In column 54 I have a title that contains an opening " and a closing " in the middle of the string. I deleted them because they caused this column to be split into three different columns, but after deleting the two " I get the problem described in [edit1]. Does anyone know why?
When I retype the two " in that column, the problem described in [edit1] still exists. I am confused.

[edit3] I would also like to know how to get the "Input" into the tree view. It is shown in the online tutorial, but I can't find it when building my own setup.

Answers

  • land
    land New Altair Community Member
    Hi,
    there's a very simple way of loading Excel files into RapidMiner: just use the ExcelExampleSource operator. It will directly read your .xls file without the usual quoting problems of .csv files.

    To insert the ExcelExampleSource into the operator tree, just select it in the new operator tab on the right and drag it into the tree on the left.


    Greetings,
      Sebastian
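
If you do need to stay with .csv, the quoting problems mentioned above can usually be avoided by quoting every field and doubling any " that appears inside a field. Below is a minimal sketch using Python's standard csv module. The sample rows and the helper names write_csv and column_counts are made up for illustration, and this is not part of RapidMiner. It assumes the importing tool follows the common convention of "" as an escaped quote.

```python
import csv
import io

def write_csv(rows):
    """Serialize rows to CSV text, quoting every field.

    csv.writer doubles any embedded double quote, so a field such as
    size "large", in cm stays a single column when read back.
    """
    buf = io.StringIO()
    writer = csv.writer(buf, quoting=csv.QUOTE_ALL, lineterminator="\n")
    writer.writerows(rows)
    return buf.getvalue()

def column_counts(text):
    """Return the set of field counts found across all rows of CSV text."""
    return {len(row) for row in csv.reader(io.StringIO(text))}

# Hypothetical sample data: a header containing both quotes and a comma,
# and a value containing a comma.
rows = [
    ["id", 'size "large", in cm', "label"],
    ["1", "12,5", "ok"],
]

# Naive joining without quoting splits the header into too many pieces.
naive_header = ",".join(rows[0])
naive_pieces = naive_header.split(",")

text = write_csv(rows)
parsed = list(csv.reader(io.StringIO(text)))

print(text)
print("pieces after naive split:", len(naive_pieces))
print("column counts after proper quoting:", column_counts(text))
```

If column_counts returns more than one value for your own file, some rows have a different number of fields than others, which usually points to an unescaped quote or separator in one of the fields.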