Counting occurrences of values

Ritika
Ritika New Altair Community Member
edited November 5 in Altair RapidMiner
Hi! I want to count how many times each value appears in an attribute and print the count for each value. I tried the Aggregate operator, but I don't think it looks for and counts specific values. I would also really appreciate an answer with an explanation or screenshot of the actual process in RapidMiner rather than code.

Comments

  • lionelderkrikor
    lionelderkrikor New Altair Community Member
    Hi @Ritika

    Did you set the "group by attributes" parameter?

    Regards,

    Lionel

  • Ritika
    Ritika New Altair Community Member
    Hi Lionel,

    I used "group by attributes" to select the specific attribute where my values are. This only separates the entire attribute though. There are multiple values within this attribute -- I want to count the number of times each value occurs.
  • lionelderkrikor
    lionelderkrikor New Altair Community Member
    @Ritika,

    Please take a look at the process in the attached file and let me know whether it does what you need.
    If it does, adapt it to your process.
    If not, please share your dataset and explain in more detail what you want to achieve.

    Regards,

    Lionel 
  • Ritika
    Ritika New Altair Community Member
    I'm unable to open the file in RapidMiner; I get an error that the file is malformed. Sorry about this. Is there another way you could send it to me?
  • lionelderkrikor
    lionelderkrikor New Altair Community Member
    @Ritika,

    Please copy the code below, paste it into your XML panel, and click the green check mark. The process will then appear in the main window:
    <?xml version="1.0" encoding="UTF-8"?><process version="9.9.002">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="9.9.002" expanded="true" name="Process">
        <parameter key="logverbosity" value="init"/>
        <parameter key="random_seed" value="2001"/>
        <parameter key="send_mail" value="never"/>
        <parameter key="notification_email" value=""/>
        <parameter key="process_duration_for_mail" value="30"/>
        <parameter key="encoding" value="SYSTEM"/>
        <process expanded="true">
          <operator activated="true" class="retrieve" compatibility="9.9.002" expanded="true" height="68" name="Retrieve Golf" width="90" x="179" y="85">
            <parameter key="repository_entry" value="//Samples/data/Golf"/>
          </operator>
          <operator activated="true" class="aggregate" compatibility="9.9.002" expanded="true" height="82" name="Aggregate" width="90" x="380" y="85">
            <parameter key="use_default_aggregation" value="false"/>
            <parameter key="attribute_filter_type" value="all"/>
            <parameter key="attribute" value=""/>
            <parameter key="attributes" value=""/>
            <parameter key="use_except_expression" value="false"/>
            <parameter key="value_type" value="attribute_value"/>
            <parameter key="use_value_type_exception" value="false"/>
            <parameter key="except_value_type" value="time"/>
            <parameter key="block_type" value="attribute_block"/>
            <parameter key="use_block_type_exception" value="false"/>
            <parameter key="except_block_type" value="value_matrix_row_start"/>
            <parameter key="invert_selection" value="false"/>
            <parameter key="include_special_attributes" value="false"/>
            <parameter key="default_aggregation_function" value="average"/>
            <list key="aggregation_attributes">
              <parameter key="Outlook" value="count"/>
            </list>
            <parameter key="group_by_attributes" value="Outlook"/>
            <parameter key="count_all_combinations" value="false"/>
            <parameter key="only_distinct" value="false"/>
            <parameter key="ignore_missings" value="true"/>
          </operator>
          <connect from_op="Retrieve Golf" from_port="output" to_op="Aggregate" to_port="example set input"/>
          <connect from_op="Aggregate" from_port="example set output" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>

    Regards,

    Lionel


  • Ritika
    Ritika New Altair Community Member
    Yes, I was able to adapt it to my process! Thank you so much!
  • lionelderkrikor
    lionelderkrikor New Altair Community Member
    You're welcome!

    Good luck!

    Regards,

    Lionel
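
For readers who want to check the result outside RapidMiner, the process posted above (Aggregate with "Outlook" as both the aggregation attribute with the count function and the group-by attribute) amounts to counting how often each distinct value occurs. The sketch below illustrates that idea in plain Python. It is not RapidMiner code: the sample list and the helper name count_values are made up for illustration, and skipping None is only meant to mirror the ignore_missings=true setting in the process.

```python
from collections import Counter

# Hypothetical values standing in for one nominal attribute
# (illustrative only, not the actual Golf sample data).
outlook = ["sunny", "rain", "overcast", "sunny", "rain", "sunny"]


def count_values(values):
    """Return a dict mapping each distinct value to its number of occurrences.

    Missing values (None) are skipped, as an assumed analogue of
    ignore_missings=true in the Aggregate operator.
    """
    return dict(Counter(v for v in values if v is not None))


counts = count_values(outlook)

# Print one line per distinct value, like one row per group in the result.
for value, n in sorted(counts.items()):
    print(f"{value}: {n}")
```

Each printed line corresponds to one row of the Aggregate result: the group value and its count.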