Add Random Missing Data points

btibert
btibert New Altair Community Member
edited November 2024 in Community Q&A
I am sure this is possible, but what is the best way to add missing data to a dataset? I want to add noise and save out the dataset for my class to explore and handle.

Best Answer

  • sgenzer
    sgenzer
    Altair Employee
    edited September 2019 Answer ✓
    @btibert have you tried the "Add Noise" operator? It's not exactly what you're looking for but I think you could use it to fill the need.


Answers

  • varunm1
    varunm1 New Altair Community Member
    Hello @btibert

    Is this data from a general problem or a time series problem? For a general problem, imputing the missing values with an algorithm such as KNN is suitable (an operator is available for this). For time series, you can use the Replace Missing Values operator with the mean, or the Replace Missing Values (Series) operator with linear interpolation.
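For readers who want to see what the two time series options above amount to, here is a minimal plain-Python sketch. It is not RapidMiner's implementation; the function names `fill_mean` and `fill_linear` are hypothetical, missing values are assumed to be represented as `None`, and leading or trailing gaps are left untouched by the interpolation.

```python
def fill_mean(values):
    """Replace every None with the mean of the values that are present."""
    present = [v for v in values if v is not None]
    if not present:
        return list(values)  # nothing to average, return an unchanged copy
    mean = sum(present) / len(present)
    return [mean if v is None else v for v in values]


def fill_linear(values):
    """Linearly interpolate interior None gaps (assumed behaviour for this sketch).

    Gaps before the first known value or after the last one stay None.
    """
    filled = list(values)
    known = [i for i, v in enumerate(values) if v is not None]
    # Walk over each pair of neighbouring known points and fill the gap between them.
    for left, right in zip(known, known[1:]):
        step = (values[right] - values[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = values[left] + step * (i - left)
    return filled


series = [None, 1.0, None, None, 4.0, None]
mean_filled = fill_mean(series)
linear_filled = fill_linear(series)
```

Mean replacement ignores ordering, so it suits general tabular data; interpolation uses the neighbouring points, which is why it is the usual choice for ordered series.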
  • MartinLiebig
    MartinLiebig
    Altair Employee
    I usually go for Generate Attributes with:
    if(rand()<0.2,MISSING_NUMERICAL,value)

    Cheers,
    Martin
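The same idea (replace each value with a missing marker with probability 0.2) can be sketched outside RapidMiner in plain Python. This is only an illustration under stated assumptions: the function name `add_missing`, the list-of-dicts data layout, and the use of `None` as the missing marker are all hypothetical choices, not anything taken from the thread.

```python
import random


def add_missing(rows, columns, rate=0.2, seed=None):
    """Return a copy of `rows` in which each value in `columns` is set to
    None with probability `rate`. The input rows are not modified."""
    rng = random.Random(seed)  # a seed makes the generated dataset reproducible
    result = []
    for row in rows:
        new_row = dict(row)
        for col in columns:
            # rng.random() is uniform on [0, 1), mirroring rand() < 0.2
            if rng.random() < rate:
                new_row[col] = None
        result.append(new_row)
    return result


data = [{"id": i, "value": float(i)} for i in range(1000)]
noisy = add_missing(data, ["value"], rate=0.2, seed=42)
missing_count = sum(1 for r in noisy if r["value"] is None)
```

Fixing the seed is useful for a class exercise, since every student then receives the same pattern of missing values.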
  • btibert
    btibert New Altair Community Member
    Thanks Scott. I suppose I could get there via multiple splits, declaring missing values on some paths and then appending/unioning the results, but it is good to know about the Add Noise operator, which I was not aware of. Thanks!
  • sgenzer
    sgenzer
    Altair Employee
    Glad we could help, @btibert. @mschmitz @varunm1, nice solutions as well. I always love to see how many ways you can tackle a problem in RapidMiner. :smile:

    Scott