Easiest way to remove empty rows and attributes?

User: "kayman"
New Altair Community Member
Updated by Jocelyn

 

Hi,

What would be the easiest way to remove empty columns and rows from a dataset?

My data is imported from a source that I can't easily modify before loading, so I'd prefer to deal with it within RapidMiner.

 

What I'm looking for is a simple way to remove both Col1 and Row 1 in the example below, but in reality I can have more than a few empty rows and columns in much larger tables. I've tried the filters for missing / no missing attributes, but they basically remove everything as soon as they find a missing value, which is not what I need.

 

     Col1   Col2        Col3
1    ?      ?           ?
2    ?      ?           Some data
3    ?      ?           Some data
4    ?      ?           Some data
5    ?      Some data   ?

 

For now I have a fairly heavy workflow: I first create an id, then multiply my data and, in one copy, replace all missing values with 0 and everything else with 1. I then use aggregates to sum per row and remove every row where the sum is 0. Next I loop through all my attributes, do something similar, and remove all columns where the aggregated sum is 0.
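
To make the goal concrete, the logic of that workflow can be sketched outside RapidMiner. This is only a minimal plain-Python illustration, not a RapidMiner process or operator: the function name `drop_empty`, the dict-of-rows layout keyed by id, and the use of `None` for "?" are all assumptions made for the example.

```python
from typing import Dict, Optional

# Hypothetical layout: row id -> {column name -> value}; None stands for "?".
Row = Dict[str, Optional[str]]


def drop_empty(table: Dict[int, Row]) -> Dict[int, Row]:
    """Drop rows and columns in which every value is missing."""
    # Keep a row only if at least one of its values is present.
    rows = {
        rid: row
        for rid, row in table.items()
        if any(value is not None for value in row.values())
    }
    # A column is "used" if any kept row has a value in it.
    used = {
        col
        for row in rows.values()
        for col, value in row.items()
        if value is not None
    }
    # Rebuild each kept row with only the used columns, preserving column order.
    return {
        rid: {col: value for col, value in row.items() if col in used}
        for rid, row in rows.items()
    }


# The example table from this post.
table = {
    1: {"Col1": None, "Col2": None, "Col3": None},
    2: {"Col1": None, "Col2": None, "Col3": "Some data"},
    3: {"Col1": None, "Col2": None, "Col3": "Some data"},
    4: {"Col1": None, "Col2": None, "Col3": "Some data"},
    5: {"Col1": None, "Col2": "Some data", "Col3": None},
}

cleaned = drop_empty(table)
for rid, row in cleaned.items():
    print(rid, row)
```

On the example table this keeps rows 2 to 5 and columns Col2 and Col3. The remaining missing values inside those rows are left alone, which is the difference from a filter that rejects any row or column containing a missing value.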

 

It does the trick but seems like overkill, so I'm wondering if I'm missing an easier way to deal with this.

    User: "tftemme"
    New Altair Community Member
    Accepted Answer

    Hi @kayman, Hi @mschmitz,

     

    From my point of view this would be a handy addition to the Toolbox.

    I will have a look if I find time for this.

     

    Best regards,
    Fabian