[SOLVED] Standard Deviation

josh321
josh321 New Altair Community Member
edited November 5 in Community Q&A
I'm aware of operators for average, min, max, etc., but I see none for standard deviation. I'm trying to filter data to include only values that are within 3 standard deviations of the mean for a given attribute. What is the best way to go about this in RapidMiner?

Thanks,
Josh

Answers

  • Hi,
    You can use the Generate Attributes operator to implement the standard deviation formula, and then use the Filter Examples operator.

    Or try the Weight by Deviation operator.
  • josh321
    josh321 New Altair Community Member
    Hi. Thanks for the reply, but I'm not sure I understand. I've tried the Weight by Deviation operator, but it appears to weight entire attributes against the data set, rather than comparing each example against the attribute mean. I'm also not sure how to implement a formula that produces a standard deviation.

    Thanks,
    Josh
  • dan_agape
    dan_agape New Altair Community Member
    Answer ✓
    Hi Josh,

    Use Generate Attributes to make a copy of the given attribute (call the new attribute C), then use Normalize with the Z-transformation method to modify the values of C, and then use Filter Examples to keep only the rows for which the value of C is between -3 and 3. Finally, you can discard attribute C.

    Dan 
  • josh321
    josh321 New Altair Community Member
    That did the trick, thanks!
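
For readers who want to check the logic of the accepted answer outside RapidMiner, here is a minimal Python sketch of the same idea: compute a z-score for each row (the role played by the temporary attribute C), keep rows whose z-score is between -3 and 3, and never store the z-score in the result. The function name `filter_within_k_std`, the list-of-dicts data layout, and the use of the sample standard deviation (n - 1) are assumptions made for illustration; this is not RapidMiner's API, and the Normalize operator may compute the deviation slightly differently.

```python
from statistics import mean, stdev


def filter_within_k_std(rows, attribute, k=3.0):
    """Keep rows whose `attribute` lies within k standard deviations of the mean.

    Hypothetical helper: `rows` is a list of dicts, `attribute` is a key
    holding a numeric value. Needs at least two rows.
    """
    values = [row[attribute] for row in rows]
    mu = mean(values)
    sigma = stdev(values)  # assumption: sample standard deviation (n - 1)

    if sigma == 0:
        # All values are identical, so nothing can be an outlier.
        return list(rows)

    kept = []
    for row in rows:
        # Temporary z-score, analogous to the normalized copy C.
        z = (row[attribute] - mu) / sigma
        if -k <= z <= k:
            kept.append(row)
    return kept


# Example data: twenty ordinary values and one extreme value.
rows = [{"id": i, "x": 10.0} for i in range(20)] + [{"id": 20, "x": 1000.0}]
kept = filter_within_k_std(rows, "x")
print(len(rows), len(kept))
```

Filtering on the z-score is equivalent to keeping values between mean - 3·std and mean + 3·std. Note that with very small data sets no value can reach a z-score of 3, so the filter only starts to have an effect once there are enough rows.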