[SOLVED] Loop attributes
Jagsus
New Altair Community Member
Hi,
I am working with a dataset containing 60 attributes, each having 700 examples (and no missing values).
Attribute types are either real or integer.
10 attributes (always the same ones) are "normal"; the remaining 50 are potential "label" attributes.
What I want to do is pick 1 of the remaining 50 attributes, set it as the label, run the resulting 11 attributes through a Linear Regression, and store the result.
My problem is that I don't want to do this 50 times manually (select a new label attribute, rename the Store operator, and run the process).
So my question is: is it possible to loop through the 50 potential labels automatically with a loop operator and store each result file in the same folder (ideally named after the label), so that I only have to start the process once and get 50 result files out of it?
PS: I have already seen the loop operators in RM, but have not had time to test them yet.
Answers
Yes, that's possible with the Loop Attributes operator.
Best,
Marius

Hey, thanks so far!
But I can't seem to get it working.
Could anyone explain in a little more detail (or point me to a tutorial) how the loop operators work?
Edit: to be more precise, I selected a subset of attributes in the Loop Attributes operator and now want it to set the attribute selected in the current iteration as the label.

Nobody? I have been trying for 3 hours now and I just have no clue how to get this working.

Hi there,
Searching this forum for 'loop attributes' may help, but here's a link that may be useful anyway.
http://rapid-i.com/rapidforum/index.php/topic,2351.msg9346.html#msg9346
Hope so.

Thanks, I'll check that during the day and give a response when I'm done!

Sadly, this did not help me (maybe because my RapidMiner skills are absolutely at a beginner level).
What I managed to build is the following (still not working); the idea behind it is described after the process:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.1.017">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="5.1.017" expanded="true" name="Process">
<process expanded="true" height="100" width="413">
<operator activated="true" class="retrieve" compatibility="5.1.017" expanded="true" height="60" name="Retrieve" width="90" x="45" y="30">
<parameter key="repository_entry" value="//AlDi/data/01_mahr_mhrc"/>
</operator>
<operator activated="true" class="loop_attributes" compatibility="5.1.017" expanded="true" height="60" name="Loop Attributes" width="90" x="246" y="30">
<parameter key="attribute_filter_type" value="subset"/>
<parameter key="attributes" value="|vtang|sigmaN|WzT|WstT|Wear|Rzini|HRC"/>
<parameter key="invert_selection" value="true"/>
<process expanded="true" height="892" width="1088">
<operator activated="true" class="set_role" compatibility="5.1.017" expanded="true" height="76" name="Set Role" width="90" x="179" y="30">
<parameter key="name" value="%{loop_attribute}"/>
<parameter key="target_role" value="label"/>
<list key="set_additional_roles">
<parameter key="Name" value="id"/>
</list>
</operator>
<operator activated="true" class="retrieve" compatibility="5.1.017" expanded="true" height="60" name="Retrieve (2)" width="90" x="162" y="177">
<parameter key="repository_entry" value="//AlDi/data/01_mahr_mhrc"/>
</operator>
<operator activated="true" class="select_attributes" compatibility="5.1.017" expanded="true" height="76" name="Select Attributes" width="90" x="313" y="165">
<parameter key="attribute_filter_type" value="subset"/>
<parameter key="attributes" value="|HRC|Name|vtang|sigmaN|WzT|WstT|Wear|Rzini"/>
</operator>
<operator activated="true" class="set_role" compatibility="5.1.017" expanded="true" height="76" name="Set Role (2)" width="90" x="487" y="160">
<parameter key="name" value="Name"/>
<parameter key="target_role" value="id"/>
<list key="set_additional_roles"/>
</operator>
<operator activated="true" class="join" compatibility="5.1.017" expanded="true" height="76" name="Join" width="90" x="715" y="30">
<list key="key_attributes"/>
</operator>
<connect from_port="example set" to_op="Set Role" to_port="example set input"/>
<connect from_op="Set Role" from_port="example set output" to_op="Join" to_port="left"/>
<connect from_op="Retrieve (2)" from_port="output" to_op="Select Attributes" to_port="example set input"/>
<connect from_op="Select Attributes" from_port="example set output" to_op="Set Role (2)" to_port="example set input"/>
<connect from_op="Set Role (2)" from_port="example set output" to_op="Join" to_port="right"/>
<connect from_op="Join" from_port="join" to_port="example set"/>
<portSpacing port="source_example set" spacing="0"/>
<portSpacing port="sink_example set" spacing="0"/>
</process>
</operator>
<connect from_op="Retrieve" from_port="output" to_op="Loop Attributes" to_port="example set"/>
<connect from_op="Loop Attributes" from_port="example set" to_port="result 1"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
</process>
</operator>
</process>
First I take my dataset, select all possible label attributes, pass them into the loop, and set one as the label.
At the same time I load the dataset a second time inside the loop, select all the "all-time" attributes, and join them on, so that I end up with as many new datasets as there are possible label attributes, each containing one different label and the same "all-timers". But this process just runs until I force-close the program.
I'm sorry for all the trouble my lack of skill might cause, and I really appreciate your help!
PS: I did a forum search for "loop attributes" but didn't find anything helpful (at least from my point of view).

Hey guys,
I have tried several times over the week, but I always ended up discarding what I had because nothing worked.
It may sound a little desperate at this point, but I start to feel incredibly stupid each time I open RapidMiner, because the solution is probably pretty simple and I just can't figure it out. So: could someone please pass me some XML code that might work for the problem described in post 1?
That would be really awesome!
PS: And yes, I have searched the forums and Google, and I am working with the Loop Attributes and Work on Subset operators.

Hey, I managed to build the following, but I get the error "Duplicate Attribute Role: label". Any ideas?
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.2.000">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="5.2.000" expanded="true" name="Process">
<process expanded="true" height="600" width="614">
<operator activated="true" class="generate_data" compatibility="5.2.000" expanded="true" height="60" name="Generate Data" width="90" x="45" y="30"/>
<operator activated="true" class="loop_attributes" compatibility="5.2.000" expanded="true" height="60" name="Loop Attributes" width="90" x="313" y="30">
<parameter key="attribute_filter_type" value="subset"/>
<parameter key="attributes" value="|att1|att2|att3|att4|att5"/>
<parameter key="iteration_macro" value="loop"/>
<process expanded="true" height="618" width="614">
<operator activated="true" class="work_on_subset" compatibility="5.2.000" expanded="true" height="76" name="Work on Subset" width="90" x="45" y="30">
<parameter key="attribute_filter_type" value="single"/>
<parameter key="attribute" value="%{loop}"/>
<process expanded="true" height="618" width="663">
<operator activated="true" class="set_role" compatibility="5.2.000" expanded="true" height="76" name="Set Role" width="90" x="179" y="30">
<parameter key="name" value="%{loop}"/>
<parameter key="target_role" value="label"/>
<list key="set_additional_roles"/>
</operator>
<connect from_port="exampleSet" to_op="Set Role" to_port="example set input"/>
<connect from_op="Set Role" from_port="example set output" to_port="example set"/>
<portSpacing port="source_exampleSet" spacing="0"/>
<portSpacing port="sink_example set" spacing="0"/>
<portSpacing port="sink_through 1" spacing="0"/>
</process>
</operator>
<operator activated="true" class="linear_regression" compatibility="5.2.000" expanded="true" height="94" name="Linear Regression" width="90" x="180" y="30"/>
<connect from_port="example set" to_op="Work on Subset" to_port="example set"/>
<connect from_op="Work on Subset" from_port="example set" to_op="Linear Regression" to_port="training set"/>
<connect from_op="Linear Regression" from_port="exampleSet" to_port="example set"/>
<portSpacing port="source_example set" spacing="0"/>
<portSpacing port="sink_example set" spacing="0"/>
</process>
</operator>
<operator activated="true" class="remember" compatibility="5.2.000" expanded="true" height="60" name="Remember" width="90" x="478" y="35">
<parameter key="io_object" value="ExampleSet"/>
</operator>
<operator activated="true" class="recall" compatibility="5.2.000" expanded="true" height="60" name="Recall" width="90" x="514" y="165">
<parameter key="name" value="model"/>
<parameter key="io_object" value="ExampleSet"/>
</operator>
<connect from_op="Generate Data" from_port="output" to_op="Loop Attributes" to_port="example set"/>
<connect from_op="Loop Attributes" from_port="example set" to_op="Remember" to_port="store"/>
<connect from_op="Recall" from_port="result" to_port="result 1"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
</process>
</operator>
</process>

Hi,
You can do it with Loop Attributes and macros, like this:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.2.001">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="5.2.001" expanded="true" name="Root">
<description></description>
<parameter key="random_seed" value="2000"/>
<process expanded="true" height="390" width="634">
<operator activated="true" class="retrieve" compatibility="5.2.001" expanded="true" height="60" name="Retrieve" width="90" x="45" y="30">
<parameter key="repository_entry" value="//Samples/data/Polynomial"/>
</operator>
<operator activated="true" class="generate_id" compatibility="5.2.001" expanded="true" height="76" name="Generate ID" width="90" x="179" y="30"/>
<operator activated="true" class="generate_data" compatibility="5.2.001" expanded="true" height="60" name="Generate Data" width="90" x="45" y="300">
<parameter key="number_examples" value="200"/>
<parameter key="number_of_attributes" value="50"/>
</operator>
<operator activated="true" class="generate_id" compatibility="5.2.001" expanded="true" height="76" name="Generate ID (2)" width="90" x="45" y="165"/>
<operator activated="true" class="rename_by_replacing" compatibility="5.2.001" expanded="true" height="76" name="Rename by Replacing" width="90" x="179" y="165">
<parameter key="replace_what" value="att"/>
<parameter key="replace_by" value="label_"/>
</operator>
<operator activated="true" class="join" compatibility="5.2.001" expanded="true" height="76" name="Join" width="90" x="313" y="30">
<list key="key_attributes"/>
</operator>
<operator activated="true" class="set_macro" compatibility="5.2.001" expanded="true" height="76" name="Set Macro" width="90" x="313" y="165">
<parameter key="macro" value="label"/>
<parameter key="value" value="label"/>
</operator>
<operator activated="true" class="loop_attributes" compatibility="5.2.001" expanded="true" height="60" name="Loop Attributes" width="90" x="447" y="165">
<parameter key="attribute_filter_type" value="regular_expression"/>
<parameter key="regular_expression" value="label_.*|att.*"/>
<process expanded="true" height="371" width="661">
<operator activated="true" class="exchange_roles" compatibility="5.2.001" expanded="true" height="76" name="Exchange Roles" width="90" x="112" y="30">
<parameter key="first_attribute" value="%{label}"/>
<parameter key="second_attribute" value="%{loop_attribute}"/>
</operator>
<operator activated="true" class="set_macro" compatibility="5.2.001" expanded="true" height="76" name="Set Macro (2)" width="90" x="241" y="27">
<parameter key="macro" value="label"/>
<parameter key="value" value="%{loop_attribute}"/>
</operator>
<operator activated="true" class="work_on_subset" compatibility="5.2.001" expanded="true" height="76" name="Work on Subset" width="90" x="380" y="30">
<parameter key="attribute_filter_type" value="regular_expression"/>
<parameter key="regular_expression" value="%{label}|a.*"/>
<parameter key="include_special_attributes" value="true"/>
<process expanded="true" height="371" width="661">
<operator activated="true" class="provide_macro_as_log_value" compatibility="5.2.001" expanded="true" height="76" name="Provide Macro as Log Value" width="90" x="112" y="30">
<parameter key="macro_name" value="%{label}"/>
</operator>
<operator activated="true" class="extract_macro" compatibility="5.2.001" expanded="true" height="60" name="Extract Macro" width="90" x="246" y="30">
<parameter key="macro" value="atts"/>
<parameter key="macro_type" value="number_of_attributes"/>
</operator>
<operator activated="true" class="log" compatibility="5.2.001" expanded="true" height="76" name="Log" width="90" x="380" y="30">
<list key="log">
<parameter key="Label" value="operator.Loop Attributes.value.feature_name"/>
<parameter key="Atts" value="operator.Extract Macro.value.macro_value"/>
</list>
</operator>
<connect from_port="exampleSet" to_op="Provide Macro as Log Value" to_port="through 1"/>
<connect from_op="Provide Macro as Log Value" from_port="through 1" to_op="Extract Macro" to_port="example set"/>
<connect from_op="Extract Macro" from_port="example set" to_op="Log" to_port="through 1"/>
<connect from_op="Log" from_port="through 1" to_port="example set"/>
<portSpacing port="source_exampleSet" spacing="0"/>
<portSpacing port="sink_example set" spacing="0"/>
<portSpacing port="sink_through 1" spacing="0"/>
</process>
</operator>
<connect from_port="example set" to_op="Exchange Roles" to_port="example set input"/>
<connect from_op="Exchange Roles" from_port="example set output" to_op="Set Macro (2)" to_port="through 1"/>
<connect from_op="Set Macro (2)" from_port="through 1" to_op="Work on Subset" to_port="example set"/>
<connect from_op="Work on Subset" from_port="example set" to_port="example set"/>
<portSpacing port="source_example set" spacing="0"/>
<portSpacing port="sink_example set" spacing="0"/>
</process>
</operator>
<operator activated="true" class="log_to_data" compatibility="5.2.001" expanded="true" height="94" name="Log to Data" width="90" x="514" y="30"/>
<connect from_op="Retrieve" from_port="output" to_op="Generate ID" to_port="example set input"/>
<connect from_op="Generate ID" from_port="example set output" to_op="Join" to_port="left"/>
<connect from_op="Generate Data" from_port="output" to_op="Generate ID (2)" to_port="example set input"/>
<connect from_op="Generate ID (2)" from_port="example set output" to_op="Rename by Replacing" to_port="example set input"/>
<connect from_op="Rename by Replacing" from_port="example set output" to_op="Join" to_port="right"/>
<connect from_op="Join" from_port="join" to_op="Set Macro" to_port="through 1"/>
<connect from_op="Set Macro" from_port="through 1" to_op="Loop Attributes" to_port="example set"/>
<connect from_op="Loop Attributes" from_port="example set" to_op="Log to Data" to_port="through 1"/>
<connect from_op="Log to Data" from_port="exampleSet" to_port="result 1"/>
<connect from_op="Log to Data" from_port="through 1" to_port="result 2"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
<portSpacing port="sink_result 3" spacing="0"/>
</process>
</operator>
</process>
As an aside, I would urge you to read the documentation and work through the examples, rather than Googling for an answer.

Thanks! With my current knowledge I would never have gotten this far.
Just one (hopefully) last thing:
Could you explain what happens inside the "Work on Subset" operator? The rest is clear to me.

Hi there,
You can see that the process just thins the attribute set down to 1 label and the same 5 attributes, and logs that; so just before the Log operator you can do your learning, or optimisation, or whatever. If you put a breakpoint before the Log operator you can see the example set that would be available, like this:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.2.001">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="5.2.001" expanded="true" name="Root">
<parameter key="random_seed" value="2000"/>
<process expanded="true" height="390" width="634">
<operator activated="true" class="retrieve" compatibility="5.2.001" expanded="true" height="60" name="Retrieve" width="90" x="45" y="30">
<parameter key="repository_entry" value="//Samples/data/Polynomial"/>
</operator>
<operator activated="true" class="generate_id" compatibility="5.2.001" expanded="true" height="76" name="Generate ID" width="90" x="179" y="30"/>
<operator activated="true" class="generate_data" compatibility="5.2.001" expanded="true" height="60" name="Generate Data" width="90" x="45" y="300">
<parameter key="number_examples" value="200"/>
<parameter key="number_of_attributes" value="50"/>
</operator>
<operator activated="true" class="generate_id" compatibility="5.2.001" expanded="true" height="76" name="Generate ID (2)" width="90" x="45" y="165"/>
<operator activated="true" class="rename_by_replacing" compatibility="5.2.001" expanded="true" height="76" name="Rename by Replacing" width="90" x="179" y="165">
<parameter key="replace_what" value="att"/>
<parameter key="replace_by" value="label_"/>
</operator>
<operator activated="true" class="join" compatibility="5.2.001" expanded="true" height="76" name="Join" width="90" x="313" y="30">
<list key="key_attributes"/>
</operator>
<operator activated="true" class="set_macro" compatibility="5.2.001" expanded="true" height="76" name="Set Macro" width="90" x="313" y="165">
<parameter key="macro" value="label"/>
<parameter key="value" value="label"/>
</operator>
<operator activated="true" class="loop_attributes" compatibility="5.2.001" expanded="true" height="60" name="Loop Attributes" width="90" x="447" y="165">
<parameter key="attribute_filter_type" value="regular_expression"/>
<parameter key="regular_expression" value="label_.*|att.*"/>
<process expanded="true" height="371" width="661">
<operator activated="true" class="exchange_roles" compatibility="5.2.001" expanded="true" height="76" name="Exchange Roles" width="90" x="112" y="30">
<parameter key="first_attribute" value="%{label}"/>
<parameter key="second_attribute" value="%{loop_attribute}"/>
</operator>
<operator activated="true" class="set_macro" compatibility="5.2.001" expanded="true" height="76" name="Set Macro (2)" width="90" x="241" y="27">
<parameter key="macro" value="label"/>
<parameter key="value" value="%{loop_attribute}"/>
</operator>
<operator activated="true" class="work_on_subset" compatibility="5.2.001" expanded="true" height="76" name="Work on Subset" width="90" x="380" y="30">
<parameter key="attribute_filter_type" value="regular_expression"/>
<parameter key="regular_expression" value="%{label}|a.*"/>
<parameter key="include_special_attributes" value="true"/>
<process expanded="true" height="371" width="661">
<operator activated="true" class="provide_macro_as_log_value" compatibility="5.2.001" expanded="true" height="76" name="Provide Macro as Log Value" width="90" x="112" y="30">
<parameter key="macro_name" value="%{label}"/>
</operator>
<operator activated="true" class="extract_macro" compatibility="5.2.001" expanded="true" height="60" name="Extract Macro" width="90" x="246" y="30">
<parameter key="macro" value="atts"/>
<parameter key="macro_type" value="number_of_attributes"/>
</operator>
<operator activated="true" breakpoints="before" class="log" compatibility="5.2.001" expanded="true" height="76" name="Log" width="90" x="514" y="75">
<list key="log">
<parameter key="Label" value="operator.Loop Attributes.value.feature_name"/>
<parameter key="Atts" value="operator.Extract Macro.value.macro_value"/>
</list>
</operator>
<connect from_port="exampleSet" to_op="Provide Macro as Log Value" to_port="through 1"/>
<connect from_op="Provide Macro as Log Value" from_port="through 1" to_op="Extract Macro" to_port="example set"/>
<connect from_op="Extract Macro" from_port="example set" to_op="Log" to_port="through 1"/>
<connect from_op="Log" from_port="through 1" to_port="example set"/>
<portSpacing port="source_exampleSet" spacing="0"/>
<portSpacing port="sink_example set" spacing="0"/>
<portSpacing port="sink_through 1" spacing="0"/>
</process>
</operator>
<connect from_port="example set" to_op="Exchange Roles" to_port="example set input"/>
<connect from_op="Exchange Roles" from_port="example set output" to_op="Set Macro (2)" to_port="through 1"/>
<connect from_op="Set Macro (2)" from_port="through 1" to_op="Work on Subset" to_port="example set"/>
<connect from_op="Work on Subset" from_port="example set" to_port="example set"/>
<portSpacing port="source_example set" spacing="0"/>
<portSpacing port="sink_example set" spacing="0"/>
</process>
</operator>
<operator activated="true" class="log_to_data" compatibility="5.2.001" expanded="true" height="94" name="Log to Data" width="90" x="514" y="30"/>
<connect from_op="Retrieve" from_port="output" to_op="Generate ID" to_port="example set input"/>
<connect from_op="Generate ID" from_port="example set output" to_op="Join" to_port="left"/>
<connect from_op="Generate Data" from_port="output" to_op="Generate ID (2)" to_port="example set input"/>
<connect from_op="Generate ID (2)" from_port="example set output" to_op="Rename by Replacing" to_port="example set input"/>
<connect from_op="Rename by Replacing" from_port="example set output" to_op="Join" to_port="right"/>
<connect from_op="Join" from_port="join" to_op="Set Macro" to_port="through 1"/>
<connect from_op="Set Macro" from_port="through 1" to_op="Loop Attributes" to_port="example set"/>
<connect from_op="Loop Attributes" from_port="example set" to_op="Log to Data" to_port="through 1"/>
<connect from_op="Log to Data" from_port="exampleSet" to_port="result 1"/>
<connect from_op="Log to Data" from_port="through 1" to_port="result 2"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
<portSpacing port="sink_result 3" spacing="0"/>
</process>
</operator>
</process>

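To tie this back to the original question (one Linear Regression per candidate label, each result stored under the label's name), the inner process of Work on Subset could hold the learner and a Store operator in place of the logging operators. What follows is only a minimal sketch, not a tested process: it assumes the `%{label}` macro is set as in the process above, that the current label attribute already carries the label role when it reaches this point, and that the port names match those of the 5.2 processes in this thread. The repository path `results/lr_%{label}` is a made-up example; a relative path like this is normally resolved against the location of the saved process.

```xml
<!-- Sketch of the inner process of "Work on Subset" (assumptions: see text above). -->
<process expanded="true">
  <!-- Train on the thinned-down example set: the current label plus the fixed attributes. -->
  <operator activated="true" class="linear_regression" compatibility="5.2.001" expanded="true" name="Linear Regression"/>
  <!-- Store the model; %{label} expands to the current label name, so each iteration writes its own entry. -->
  <operator activated="true" class="store" compatibility="5.2.001" expanded="true" name="Store">
    <!-- Hypothetical repository path, adjust to your own repository layout. -->
    <parameter key="repository_entry" value="results/lr_%{label}"/>
  </operator>
  <connect from_port="exampleSet" to_op="Linear Regression" to_port="training set"/>
  <connect from_op="Linear Regression" from_port="model" to_op="Store" to_port="input"/>
  <!-- Hand the unchanged example set back so the outer loop can continue with the next label. -->
  <connect from_op="Linear Regression" from_port="exampleSet" to_port="example set"/>
</process>
```

Because the macro is expanded on every iteration, there is no need to rename the Store operator by hand; a single run should leave one stored model per label in the same folder.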
All right, I think this can be flagged as solved. Thank you for your efforts.