Data Cleanup
iason
New Altair Community Member
Hello all,
This is my first post and my first attempt to work with actual data in RapidMiner, so please excuse any ignorance.
What I am trying to achieve is to clean up my data, which is imported from CSV files.
First of all, I have a lot of missing values, which show up as ? in the tables. I need a way to keep those out.
Secondly, I have some rules (e.g. att1*att2 < 5000) and I want to filter the data based on them, preferably without adding an extra column.
I can do all that in a spreadsheet and import clean data into RM, but it would save a lot of time if it could be done internally.
Thank you all in advance.
Answers
Hi,
"First of all, I have a lot of missing values, which show up as ? on the tables. I need a way to keep those out."
What would you like to filter out: examples containing any missing value (?), or attributes containing any missing values? For the first, use the operator "Filter Examples" with condition "no missing attributes"; for the second, use the operator "Select Attributes" with filter type "no missing values".
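To make the difference between the two options concrete, here is a minimal sketch in plain Python (not RapidMiner itself). The sample data and the function names are made up for illustration; the only thing taken from the thread is that a missing value shows up as "?". The first function drops examples (rows) that contain a missing value, the second drops attributes (columns) that contain one.

```python
import csv
import io

# Hypothetical sample data; "?" marks a missing value, as in the table view.
RAW = """att1,att2,att3
10,20,a
?,30,b
40,50,?
60,70,c
"""

MISSING = "?"


def load_rows(text):
    """Parse CSV text into a list of dicts, one dict per example (row)."""
    return list(csv.DictReader(io.StringIO(text)))


def drop_examples_with_missing(rows):
    """Keep only the rows in which no attribute is missing."""
    return [r for r in rows if MISSING not in r.values()]


def drop_attributes_with_missing(rows):
    """Keep only the columns that have no missing value in any row."""
    if not rows:
        return []
    complete = [k for k in rows[0] if all(r[k] != MISSING for r in rows)]
    return [{k: r[k] for k in complete} for r in rows]


rows = load_rows(RAW)
complete_rows = drop_examples_with_missing(rows)    # fewer rows, all columns
complete_cols = drop_attributes_with_missing(rows)  # all rows, fewer columns
```

With this sample, the row-wise variant keeps two of the four examples, while the column-wise variant keeps all four examples but only the att2 column. Which one is appropriate depends on whether losing rows or losing attributes hurts the analysis more.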
"Secondly, I have some rules (e.g. att1*att2 < 5000) and I want to filter the data based on that, preferably without adding an extra column."
Currently the best option is probably to create such an index column with the operator "Generate Attributes", filter the examples with "Filter Examples", and then remove the index column again with "Select Attributes".
We are actually revising the operator "Filter Examples" for one of the next versions, and it will certainly also allow expressions like this one to be used directly in the operator.
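The three-step workaround (generate a helper column, filter on it, remove it again) can be sketched in plain Python as follows. This only illustrates the logic and is not a RapidMiner process; the sample values and the helper name rule_value are hypothetical, while att1, att2 and the threshold 5000 come from the question.

```python
# Hypothetical examples; att1 and att2 follow the names used in the question.
examples = [
    {"att1": 10.0, "att2": 100.0},   # product 1000, satisfies the rule
    {"att1": 100.0, "att2": 80.0},   # product 8000, violates the rule
    {"att1": 50.0, "att2": 99.0},    # product 4950, satisfies the rule
]
THRESHOLD = 5000

# Step 1 (analogous to "Generate Attributes"): add a helper column.
with_helper = [{**e, "rule_value": e["att1"] * e["att2"]} for e in examples]

# Step 2 (analogous to "Filter Examples"): keep rows that satisfy the rule.
kept = [e for e in with_helper if e["rule_value"] < THRESHOLD]

# Step 3 (analogous to "Select Attributes"): drop the helper column again.
cleaned = [{k: v for k, v in e.items() if k != "rule_value"} for e in kept]
```

The result contains only the original attributes, so the helper column exists just for the duration of the filtering step.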
Cheers,
Ingo
Thank you, problem solved.