Easiest way to remove empty rows and attributes?

User: "kayman"
New Altair Community Member
Updated by Jocelyn

 

Hi,

What would be the easiest way to remove empty columns / rows in a dataset?

My data is imported from a source that I can't easily modify before loading, so I'd prefer to deal with it within RapidMiner.

 

What I'm looking for is a simple way to remove both Col1 and row 1 in the example below, but in reality I can have many empty rows and columns in much larger tables. I've tried the missing / no missing attributes filters, but they remove everything as soon as they find a single missing value, which is not what I need.

 

Row  Col1  Col2       Col3
1    ?     ?          ?
2    ?     ?          Some data
3    ?     ?          Some data
4    ?     ?          Some data
5    ?     Some data  ?

 

For now I have a fairly heavy workflow: I first create an ID, then multiply my data and, in one copy, replace all missing values with 0 and everything else with 1. I then aggregate to get a sum per row and remove every row where the sum is 0. Next I loop through all my attributes, do something similar, and remove all columns where the aggregated sum is 0.
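The logic of that workflow can be sketched outside RapidMiner as follows. This is only an illustration in plain Python, not a RapidMiner process: the function name `drop_empty_rows_and_columns` is hypothetical, `None` stands in for a missing value (`?`), and the sample data mirrors the example table.

```python
# Plain-Python sketch of the indicator-and-sum idea; None stands in for "?".
MISSING = None


def drop_empty_rows_and_columns(header, rows):
    """Return (header, rows) without all-missing rows and all-missing columns."""
    # 1 where a value is present, 0 where it is missing.
    indicator = [[0 if value is MISSING else 1 for value in row] for row in rows]

    # Keep the rows whose indicator sum is non-zero.
    keep_rows = [i for i, flags in enumerate(indicator) if sum(flags) > 0]

    # Keep the columns whose indicator sum is non-zero.
    keep_cols = [
        j for j in range(len(header))
        if sum(flags[j] for flags in indicator) > 0
    ]

    new_header = [header[j] for j in keep_cols]
    new_rows = [[rows[i][j] for j in keep_cols] for i in keep_rows]
    return new_header, new_rows


# Sample data mirroring the example table.
header = ["Col1", "Col2", "Col3"]
rows = [
    [None, None, None],
    [None, None, "Some data"],
    [None, None, "Some data"],
    [None, None, "Some data"],
    [None, "Some data", None],
]
cleaned_header, cleaned_rows = drop_empty_rows_and_columns(header, rows)
```

On the sample data this should leave only Col2 and Col3 and drop the first row, while partially filled rows and columns are kept.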

 

It does the trick but seems like overkill, so I'm wondering if I'm missing an easier way to deal with this.

    User: "tftemme"
    New Altair Community Member
    Accepted Answer

    Hi @kayman, Hi @mschmitz,

     

    From my point of view this would be a handy addition to the Toolbox.

    I will have a look if I find the time.

     

    Best regards,
    Fabian