Creating random numbers with a user-defined distribution?

wasperen New Altair Community Member
edited November 5 in Community Q&A
Hi,

Does anyone have a pointer on how to generate random numbers from a user-specified distribution?

Thanks,
Willem

Answers

  • wasperen New Altair Community Member
    One option would be to generate a large number of samples outside RapidMiner and then draw from that set by sampling with replacement (bootstrap style). Does that make sense?
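
For what it's worth, the idea above (pre-generate a pool of samples outside RapidMiner, then draw from it with replacement) can be sketched in a few lines of Python. This is only a minimal sketch: the helper names `build_pool` and `resample` are made up for illustration, and the triangular target distribution is an arbitrary stand-in for whatever distribution you actually need.

```python
import random

def build_pool(draw, size, seed=0):
    """Pre-generate `size` samples by calling draw(rng) repeatedly."""
    rng = random.Random(seed)
    return [draw(rng) for _ in range(size)]

def resample(pool, n, seed=1):
    """Draw n values from the pool with replacement (bootstrap style)."""
    rng = random.Random(seed)
    return rng.choices(pool, k=n)

# Hypothetical target: a triangular distribution on [0, 10] with its peak at 3.
pool = build_pool(lambda rng: rng.triangular(0.0, 10.0, 3.0), size=100_000)
sample = resample(pool, n=1_000)
```

The pool could then be written to a file and loaded into RapidMiner as an ordinary data set. The larger the pool, the closer the resampled values follow the original distribution.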
  • Hello

    Maybe the Generate Attributes operator will help?

    Here's an example...
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.0">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="5.0.11" expanded="true" name="Process">
        <process expanded="true" height="145" width="212">
          <operator activated="true" class="generate_data" compatibility="5.0.11" expanded="true" height="60" name="Generate Data" width="90" x="112" y="75"/>
          <operator activated="true" class="generate_attributes" compatibility="5.0.11" expanded="true" height="76" name="Generate Attributes" width="90" x="308" y="74">
            <list key="function_descriptions">
              <parameter key="myFirstRandomNumber" value="rand()"/>
              <parameter key="mySecondRandomNumber" value="3*rand()+4"/>
            </list>
            <parameter key="keep_all" value="false"/>
          </operator>
          <connect from_op="Generate Data" from_port="output" to_op="Generate Attributes" to_port="example set input"/>
          <connect from_op="Generate Attributes" from_port="example set output" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>
    regards

    Andrew
  • wasperen New Altair Community Member
    Hi Andrew,

    Thanks for your attention.

    This will indeed generate random numbers: the first attribute holds values between 0.0 and 1.0, the second holds values between 4 and 7.

    The issue I have is that the rand() function generates numbers with a uniform distribution: every number in the range has an equal chance of being returned. I would like a random generator that could, for instance, be normally distributed, so that numbers closer to the mean have a higher chance of being picked. You know, the nice bell curve...

    Or, even better, a distribution that follows any user-defined curve...

    I have not come across such a generator in RapidMiner, but I am sure I'm not the only one who is after one.

    Or am I wrong, anyone?

    Thanks,
    Willem
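
For the "any user-defined curve" case, one common technique is to turn the curve into a cumulative distribution and look up uniform random numbers in it (inverse-CDF lookup). The Python sketch below assumes the curve is given as a histogram: bin edges plus a relative height per bin. The names `make_sampler`, `edges` and `heights` are hypothetical, and this runs outside RapidMiner rather than inside it.

```python
import bisect
import random

def make_sampler(edges, heights):
    """Return a function that samples from a piecewise-constant density.

    edges:   strictly increasing bin edges, len(heights) + 1 entries
    heights: non-negative relative height of the curve in each bin
    """
    # Mass of each bin is height * width; accumulate into an unnormalised CDF.
    cdf = []
    total = 0.0
    for i, h in enumerate(heights):
        total += h * (edges[i + 1] - edges[i])
        cdf.append(total)

    def sample(rng):
        u = rng.random() * total              # uniform over the total mass
        i = bisect.bisect_right(cdf, u)       # first bin whose cumulative mass exceeds u
        i = min(i, len(heights) - 1)          # guard against float round-off at the top
        # Second uniform draw places the value inside the chosen bin.
        return edges[i] + rng.random() * (edges[i + 1] - edges[i])

    return sample

# Hypothetical bell-like curve on [0, 4]: the middle bins are three times as likely.
edges = [0.0, 1.0, 2.0, 3.0, 4.0]
heights = [1.0, 3.0, 3.0, 1.0]
rng = random.Random(42)
draw = make_sampler(edges, heights)
values = [draw(rng) for _ in range(10_000)]
```

With these heights, roughly three quarters of the values should fall between 1 and 3. A finer grid of bins approximates a smooth curve more closely.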
  • Hello Willem

    You can use other mathematical functions in the Generate Attributes operator. There's also a noise generator operator that I believe adds Gaussian-distributed noise.

    Andrew
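
As an illustration of how ordinary mathematical functions can turn uniform random numbers into normally distributed ones, here is a minimal Python sketch of the Box-Muller transform. It is not taken from RapidMiner: whether the Generate Attributes expression language offers equivalent logarithm, square root and cosine functions under these names is an assumption to check, and `box_muller` is a hypothetical helper name.

```python
import math
import random

def box_muller(rng, mean=0.0, stddev=1.0):
    """Turn two uniform draws into one normally distributed value."""
    u1 = 1.0 - rng.random()   # shift to (0, 1] so log(u1) is always defined
    u2 = rng.random()
    z = math.sqrt(-2.0 * math.log(u1)) * math.cos(2.0 * math.pi * u2)
    return mean + stddev * z

rng = random.Random(123)
samples = [box_muller(rng) for _ in range(20_000)]
sample_mean = sum(samples) / len(samples)
sample_var = sum((s - sample_mean) ** 2 for s in samples) / len(samples)
```

The sample mean should come out close to 0 and the variance close to 1. Passing a different `mean` and `stddev` shifts and scales the bell curve.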