I want to loop over pairs of attributes and create a new one from an operation on them
erikapastorelli
New Altair Community Member
Hi everyone! I have a problem with RapidMiner 5. Attached is my dataset.
From the fourth column onward, I have to generate a new attribute from each pair of columns: x-1 and x-0, y-1 and y-0, and so on.
For example, A2A.MI-1 and A2A.MI-0 have to become a new attribute that holds the quotient between them.
How can I solve this? I tried Loop Attributes and a script, but I failed.
Thank you very much
Answers
Hi @erikapastorelli,
Here is an example process that uses the Execute Python operator to do what you want:
<?xml version="1.0" encoding="UTF-8"?><process version="8.1.001">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="6.0.002" expanded="true" name="Process">
<process expanded="true">
<operator activated="true" class="read_excel" compatibility="8.1.001" expanded="true" height="68" name="Read Excel" width="90" x="45" y="34">
<parameter key="excel_file" value="C:\Users\Lionel\Documents\Formations_DataScience\Rapidminer\Tests_Rapidminer\Iterate_Attributes\Iterate_Attributes.xlsx"/>
<parameter key="imported_cell_range" value="A1:I4"/>
<parameter key="first_row_as_names" value="false"/>
<list key="annotations">
<parameter key="0" value="Name"/>
</list>
<list key="data_set_meta_data_information">
<parameter key="0" value="label.true.integer.attribute"/>
<parameter key="1" value="Date.true.date_time.attribute"/>
<parameter key="2" value="test.true.polynominal.attribute"/>
<parameter key="3" value="A2A\.MI-1.true.real.attribute"/>
<parameter key="4" value="A2A\.MI-0.true.real.attribute"/>
<parameter key="5" value="AGL\.MI-1.true.real.attribute"/>
<parameter key="6" value="AGL\.MI-0.true.real.attribute"/>
<parameter key="7" value="ATL\.MI-1.true.real.attribute"/>
<parameter key="8" value="ATL\.MI-0.true.real.attribute"/>
</list>
</operator>
<operator activated="true" class="python_scripting:execute_python" compatibility="7.4.000" expanded="true" height="82" name="Execute Python" width="90" x="246" y="34">
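<parameter key="script" value="import pandas as pd&#10;&#10;# rm_main is a mandatory function,&#10;# the number of arguments has to be the number of input ports (can be none)&#10;def rm_main(data):&#10;    # Fix the column count up front so the new Quotient_ columns are never visited.&#10;    n_cols = len(data.columns)&#10;    # From index 3 onward the columns come in (x-1, x-0) pairs.&#10;    for column in range(3, n_cols - 1, 2):&#10;        data[&quot;Quotient_&quot; + str(column)] = data.iloc[:, column] / data.iloc[:, column + 1]&#10;    # connect 1 output port to see the results&#10;    return data"/>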
</operator>
<connect from_op="Read Excel" from_port="output" to_op="Execute Python" to_port="input 1"/>
<connect from_op="Execute Python" from_port="output 1" to_port="result 1"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="90"/>
<portSpacing port="sink_result 2" spacing="0"/>
</process>
</operator>
</process>
Here is the Excel file with an extract of your dataset (which I used to create this process):
https://drive.google.com/open?id=1s1AvmN0H_zZF3zgi0CnMlFPjar-qMxFD
I hope it helps,
Regards,
Lionel
NB: There is just one problem: after the Python script runs, the date attribute is set to "?", and I don't know why.
NB2: I think it is possible to improve this Python script to give more explicit names to the generated attributes.
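On the naming point in NB2, here is a minimal sketch of one way to do it, assuming every pair shares a base name and differs only by the suffixes "-1" and "-0" (as in A2A.MI-1 / A2A.MI-0). The helper name `add_pair_quotients` and the "_quotient" suffix are my own choices for illustration, not anything RapidMiner requires. It pairs columns by name rather than by position, so a column without a partner is simply skipped.

```python
import pandas as pd


def add_pair_quotients(data, first_suffix="-1", second_suffix="-0"):
    """Add one '<base>_quotient' column per (<base>-1, <base>-0) column pair.

    Hypothetical helper: the suffixes and the output naming are assumptions
    based on the column names shown in the example dataset.
    """
    result = data.copy()  # leave the input table untouched
    for name in data.columns:
        if not name.endswith(first_suffix):
            continue
        base = name[: -len(first_suffix)]
        partner = base + second_suffix
        # Only divide when the matching "-0" column really exists.
        if partner in data.columns:
            result[base + "_quotient"] = data[name] / data[partner]
    return result


# rm_main is the entry point called by the Execute Python operator
def rm_main(data):
    return add_pair_quotients(data)
```

If you paste this into the Execute Python operator, the generated attributes would be called A2A.MI_quotient, AGL.MI_quotient and so on, instead of Quotient_3, Quotient_5, etc.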