Generating new attribute columns from multiple datasets

User: "abulusu"
New Altair Community Member
Updated by Jocelyn
Hello,

I am a new user of RapidMiner. I am working with two datasets that share some attributes, such as day, month, and year. I want to add attribute columns from the second dataset (max_Temp, min_Temp) to the first dataset, using the attributes common to both (day, month, year) as the reference. I tried all the operators in the Set Operations tab (Union, Join, etc.), but none produced the result I am looking for. Can someone please help?

Thanks for your time!

    User: "JEdward"
    New Altair Community Member
    It sounds as though the Join operator is the one you need. 

    Have a look at the quick sample process below, which shows how to join using multiple attributes as the key.
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="6.5.002">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="6.5.002" expanded="true" name="Process">
        <process expanded="true">
          <operator activated="true" class="generate_data_user_specification" compatibility="6.5.002" expanded="true" height="60" name="Generate Data by User Specification" width="90" x="45" y="30">
            <list key="attribute_values">
              <parameter key="Day" value="1"/>
              <parameter key="Month" value="2"/>
              <parameter key="Year" value="2015"/>
              <parameter key="Att1" value="23"/>
              <parameter key="Att2" value="27"/>
            </list>
            <list key="set_additional_roles"/>
          </operator>
          <operator activated="true" class="generate_data_user_specification" compatibility="6.5.002" expanded="true" height="60" name="Generate Data by User Specification (2)" width="90" x="45" y="165">
            <list key="attribute_values">
              <parameter key="Day" value="1"/>
              <parameter key="Month" value="2"/>
              <parameter key="Year" value="2015"/>
              <parameter key="Att3" value="29"/>
              <parameter key="Att4" value="30"/>
            </list>
            <list key="set_additional_roles"/>
          </operator>
          <operator activated="true" class="join" compatibility="6.5.002" expanded="true" height="76" name="Join" width="90" x="246" y="120">
            <parameter key="use_id_attribute_as_key" value="false"/>
            <list key="key_attributes">
              <parameter key="Day" value="Day"/>
              <parameter key="Month" value="Month"/>
              <parameter key="Year" value="Year"/>
            </list>
          </operator>
          <connect from_op="Generate Data by User Specification" from_port="output" to_op="Join" to_port="left"/>
          <connect from_op="Generate Data by User Specification (2)" from_port="output" to_op="Join" to_port="right"/>
          <connect from_op="Join" from_port="join" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>
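
The sample process leaves the Join operator's join type at its default, which is an inner join, so only rows whose Day, Month and Year match in both datasets are kept. If the goal is to add columns to the first dataset without dropping any of its rows, a left join may be the better fit. The fragment below is a minimal sketch of that variant, not a tested process: it assumes the same key attributes as the sample and only changes the `join_type` parameter.

```xml
<operator activated="true" class="join" compatibility="6.5.002" expanded="true" name="Join">
  <!-- Assumption: "left" keeps every row of the left (first) example set;
       rows with no match on the right get missing values in the added columns. -->
  <parameter key="join_type" value="left"/>
  <parameter key="use_id_attribute_as_key" value="false"/>
  <!-- Each entry pairs a key attribute of the left set with one of the right set. -->
  <list key="key_attributes">
    <parameter key="Day" value="Day"/>
    <parameter key="Month" value="Month"/>
    <parameter key="Year" value="Year"/>
  </list>
</operator>
```

This operator would replace the Join operator in the sample process; the `connect` lines stay as they are.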
    User: "abulusu"
    New Altair Community Member
    OP
    Thanks so much! I tried the Join operator, specified the common attributes for both datasets, and it worked. I just needed to make sure the attribute names were exactly the same in both datasets; otherwise it kept failing.
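
If the key attributes are spelled differently in the two datasets, one way to bring them in line is a Rename operator placed in front of the Join. The fragment below is a sketch under assumptions: the lowercase names `day`, `month` and `year` are hypothetical stand-ins for whatever the second dataset actually uses, and the operator name is illustrative.

```xml
<operator activated="true" class="rename" compatibility="6.5.002" expanded="true" name="Rename">
  <!-- Hypothetical source names; replace with the real attribute names of the second dataset. -->
  <parameter key="old_name" value="day"/>
  <parameter key="new_name" value="Day"/>
  <!-- Further renames: key is the old name, value is the new name. -->
  <list key="rename_additional_attributes">
    <parameter key="month" value="Month"/>
    <parameter key="year" value="Year"/>
  </list>
</operator>
```

The second dataset would be connected to the Rename operator's input, and its output to the `right` port of the Join.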