Easiest way to remove empty rows and attributes?
Hi,
What would be the easiest way to remove empty columns / rows in a dataset?
My data is imported from a source that I can't easily modify before loading, so I'd prefer to deal with it within RapidMiner.
What I'm looking for is a simple way to remove both Col1 and row 1 in the example below, but in reality I can have many empty rows and columns in much larger tables. I've tried filtering on missing / no missing attributes, but that basically removes everything as soon as it finds a single missing value, so it's not exactly what I need.
Row | Col1 | Col2 | Col3 |
1 | ? | ? | ? |
2 | ? | ? | Some data |
3 | ? | ? | Some data |
4 | ? | ? | Some data |
5 | ? | Some data | ? |
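To make the difference concrete, here is a small Python sketch of the two behaviours on the example table. It is plain Python outside RapidMiner and purely illustrative: `None` stands in for `?`, and the variable names are my own assumptions, not RapidMiner operators.

```python
# Illustrative data mirroring the example table; None stands for a missing value ("?").
# Each key is the row id, each list holds the values for Col1, Col2, Col3.
rows = {
    1: [None, None, None],
    2: [None, None, "Some data"],
    3: [None, None, "Some data"],
    4: [None, None, "Some data"],
    5: [None, "Some data", None],
}

# "Any missing" semantics: a row is dropped as soon as one value is missing.
keep_if_no_missing = {
    rid: vals for rid, vals in rows.items()
    if all(v is not None for v in vals)
}

# "All missing" semantics: a row is dropped only when every value is missing.
keep_if_not_all_missing = {
    rid: vals for rid, vals in rows.items()
    if any(v is not None for v in vals)
}

print(sorted(keep_if_no_missing))       # []
print(sorted(keep_if_not_all_missing))  # [2, 3, 4, 5]
```

The first result is what I get today (nothing survives), the second is what I'm after for rows, and the same idea would apply to columns.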
For now I have a fairly heavy workflow: I first create an id, then multiply my data and, in one copy, replace all missing values with 0 and everything else with 1. I then use aggregates to sum these values per row and remove every row where the sum is 0. Next I loop through all my attributes, do something similar, and remove every column where the aggregated sum is 0.
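In case it helps to see the logic, the workflow boils down to roughly the following. This is only a Python sketch of the idea, not the actual RapidMiner process; the function name `drop_empty` and the list-based table layout are assumptions made for illustration.

```python
def drop_empty(ids, header, table):
    """Drop rows, then columns, whose values are all missing (None).

    ids    -- one id per row (the generated id, kept apart from the data)
    header -- column names
    table  -- list of rows, each a list of values aligned with header
    """
    # Replace missing values with 0 and everything else with 1.
    indicator = [[0 if v is None else 1 for v in row] for row in table]

    # Sum per row and keep only rows whose sum is greater than 0.
    keep_r = [i for i, ind in enumerate(indicator) if sum(ind) > 0]

    # Sum per column over the remaining rows and keep columns with sum > 0.
    keep_c = [
        c for c in range(len(header))
        if sum(indicator[i][c] for i in keep_r) > 0
    ]

    new_ids = [ids[i] for i in keep_r]
    new_header = [header[c] for c in keep_c]
    new_table = [[table[i][c] for c in keep_c] for i in keep_r]
    return new_ids, new_header, new_table


# The example table from above, with None standing in for "?".
ids = [1, 2, 3, 4, 5]
header = ["Col1", "Col2", "Col3"]
table = [
    [None, None, None],
    [None, None, "Some data"],
    [None, None, "Some data"],
    [None, None, "Some data"],
    [None, "Some data", None],
]

new_ids, new_header, new_table = drop_empty(ids, header, table)
print(new_ids)     # [2, 3, 4, 5]
print(new_header)  # ['Col2', 'Col3']
```

Row 1 and Col1 are gone, while rows and columns that are only partly missing are kept.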
It does the trick but seems like overkill, so I'm wondering if I'm missing an easier way to deal with this.