Replace missing value with many subgroup

User: "XiaoHui_0206"
New Altair Community Member
Updated by Jocelyn
Hello! ;)

I'm a new RapidMiner user and I've run into a problem while working with my data. I'm trying to replace the missing values in an attribute with the average of that same attribute, computed separately for each group defined by another attribute. I'd appreciate any help with this. For example, I have:

Countries  Year  Value
Malaysia   2015  1
Malaysia   2014  2
Malaysia   2013  3
Malaysia   2012  4
Malaysia   2011  ?
Malaysia   2010  ?
Malaysia   2009  7
Malaysia   2008  ?
Malaysia   2007  8
Malaysia   2006  9
Malaysia   2005  10
Malaysia   2004  ?

Indonesia  2015  1
Indonesia  2014  2
Indonesia  2013  3
Indonesia  2012  ?
Indonesia  2011  5
Indonesia  2010  6
Indonesia  2009  7
Indonesia  2008  ?
Indonesia  2007  8
Indonesia  2006  9
Indonesia  2005  10
Indonesia  2004  ?

I want to compute the average separately for each country (I have 190+ countries). When I use the Replace Missing Values operator, it computes the average over the values of all countries together, which is not what I need. How can I compute each country's average using only that country's own values?
Example:
Malaysia = (1+2+3+4+7+8+9+10)/8 = 5.5

Here is my dataset, thanks for helping me!  :)

    Hi,

    my first intuition would be to use:
    Group into Collection by country
    Loop Collection
    Replace Missing Values inside it

    Best,
    Martin
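For readers who think more easily in code, the three steps above can be sketched in plain Python. This is only an analogy under stated assumptions, not a RapidMiner process: the `rows` list reproduces the sample from the question, `None` stands in for `?`, and the function name `fill_missing_with_group_mean` is made up for illustration.

```python
from collections import defaultdict
from statistics import mean

# Sample data from the question; None stands for a missing value ("?").
rows = [
    ("Malaysia", 2015, 1), ("Malaysia", 2014, 2), ("Malaysia", 2013, 3),
    ("Malaysia", 2012, 4), ("Malaysia", 2011, None), ("Malaysia", 2010, None),
    ("Malaysia", 2009, 7), ("Malaysia", 2008, None), ("Malaysia", 2007, 8),
    ("Malaysia", 2006, 9), ("Malaysia", 2005, 10), ("Malaysia", 2004, None),
    ("Indonesia", 2015, 1), ("Indonesia", 2014, 2), ("Indonesia", 2013, 3),
    ("Indonesia", 2012, None), ("Indonesia", 2011, 5), ("Indonesia", 2010, 6),
    ("Indonesia", 2009, 7), ("Indonesia", 2008, None), ("Indonesia", 2007, 8),
    ("Indonesia", 2006, 9), ("Indonesia", 2005, 10), ("Indonesia", 2004, None),
]


def fill_missing_with_group_mean(rows):
    """Replace None with the mean of the known values of the same country."""
    # Step 1 (analogous to grouping by country): bucket the known values.
    known = defaultdict(list)
    for country, _year, value in rows:
        if value is not None:
            known[country].append(value)

    # Step 2 (analogous to looping over the groups): one mean per country.
    group_mean = {country: mean(values) for country, values in known.items()}

    # Step 3 (analogous to replacing missing values inside the loop).
    # A country with no known values at all keeps None.
    return [
        (country, year, value if value is not None else group_mean.get(country))
        for country, year, value in rows
    ]


filled = fill_missing_with_group_mean(rows)
```

With the sample data, every missing Malaysia value becomes 5.5, matching the hand calculation in the question, and Indonesia's gaps are filled from Indonesia's own nine known values only.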
    User: "XiaoHui_0206"
    New Altair Community Member
    OP
    Hi, could you explain how to group by country using Loop Collection? :'( After I drag Loop Collection into the process, I can't connect my dataset output to it. I get an error: "Expected IOObjectCollection but received ExampleSet." Also, I put the Replace Missing Values operator inside Loop Collection. What should this operator be connected to?
    User: "Caperez"
    Altair Community Member
    Hi @XiaoHui_0206,

    Another option is to use the Loop Values operator: for each value of the grouping attribute, filter the examples of that group, extract the average of the attribute, and then replace the missing values in that group with this average.

    Please find attached a simple example, 

    Best, 

    Cesar
    You need to use Group Into Collection first. It's an operator in the Operator Toolbox extension.

    @ceaperez your solution works, but it gets slow on large data sets with many nominal values, because the whole data set has to be filtered again for every value.
    Best,
    Martin
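The cost difference can be illustrated with a rough Python analogy. This is a sketch only and says nothing about how RapidMiner implements these operators: the function names `means_by_filtering` and `means_in_one_pass` and the tiny `sample` list are hypothetical. The first function rescans all rows once per distinct country, so the work grows with countries × rows; the second groups everything in a single pass.

```python
from collections import defaultdict


def means_by_filtering(rows):
    """Filter-per-value style: one full scan of the data per distinct country."""
    countries = {country for country, _value in rows}
    result = {}
    for country in countries:
        # Full pass over all rows for every single country.
        values = [v for c, v in rows if c == country and v is not None]
        if values:
            result[country] = sum(values) / len(values)
    return result


def means_in_one_pass(rows):
    """Group-first style: a single scan that accumulates sum and count per country."""
    totals = defaultdict(lambda: [0.0, 0])  # country -> [sum, count]
    for country, value in rows:
        if value is not None:
            totals[country][0] += value
            totals[country][1] += 1
    return {country: s / n for country, (s, n) in totals.items()}


# Hypothetical miniature data set; None marks a missing value.
sample = [("Malaysia", 1), ("Malaysia", None), ("Malaysia", 3),
          ("Indonesia", 2), ("Indonesia", 6), ("Indonesia", None)]

slow = means_by_filtering(sample)
fast = means_in_one_pass(sample)
```

Both functions return the same per-country averages; only the amount of scanning differs, which starts to matter with 190+ countries and many rows.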
    User: "Caperez"
    Altair Community Member
    Good point @MartinLiebig,