Loop data sets and dynamically generated file path

Serek91
New Altair Community Member
Hi,
I have a subprocess with a Write CSV operator, and it is duplicated about 70 times. The output file has a path like "{category_id}/{set_id}/filename.csv", so I want to generate it dynamically. Is that possible? For example, could I pass two custom variables into the subprocess and then use them in the file path?
EDIT:
I'm using the Loop Data Sets operator, but in each iteration I have to somehow obtain the index of the current iteration and generate the file path from it...
The process is attached.
Best Answer
-
Hi,
Just define a macro before the loop, then use it inside the loop and increment it in each iteration. Some loops do this for you, but for Loop Data Sets you have to do it yourself. I would also recommend checking out Loop Collection, so that you don't have to define 70 connections to the Loop Data Sets operator. Anyway, it's quite easy, see the little example below:

<?xml version="1.0" encoding="UTF-8"?>
<process version="9.4.001-SNAPSHOT">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="9.4.001-SNAPSHOT" expanded="true" name="Process">
    <parameter key="logverbosity" value="init"/>
    <parameter key="random_seed" value="2001"/>
    <parameter key="send_mail" value="never"/>
    <parameter key="notification_email" value=""/>
    <parameter key="process_duration_for_mail" value="30"/>
    <parameter key="encoding" value="SYSTEM"/>
    <process expanded="true">
      <operator activated="true" class="retrieve" compatibility="9.4.001-SNAPSHOT" expanded="true" height="68" name="Retrieve Iris" width="90" x="45" y="34">
        <parameter key="repository_entry" value="//Samples/data/Iris"/>
      </operator>
      <operator activated="true" class="set_macro" compatibility="9.4.001-SNAPSHOT" expanded="true" height="82" name="Prepare counter" width="90" x="179" y="34">
        <parameter key="macro" value="i"/>
        <parameter key="value" value="1"/>
      </operator>
      <operator activated="true" class="multiply" compatibility="9.4.001-SNAPSHOT" expanded="true" height="124" name="Multiply" width="90" x="313" y="34"/>
      <operator activated="true" class="loop_data_sets" compatibility="9.4.001-SNAPSHOT" expanded="true" height="124" name="Loop Data Sets" width="90" x="447" y="34">
        <parameter key="only_best" value="false"/>
        <process expanded="true">
          <operator activated="true" class="store" compatibility="9.4.001-SNAPSHOT" expanded="true" height="68" name="Store" width="90" x="179" y="34">
            <parameter key="repository_entry" value="%{i} - myData"/>
          </operator>
          <operator activated="true" class="generate_macro" compatibility="9.4.001-SNAPSHOT" expanded="true" height="82" name="Increment counter" width="90" x="380" y="34">
            <list key="function_descriptions">
              <parameter key="i" value="eval(%{i})+1"/>
            </list>
          </operator>
          <connect from_port="example set" to_op="Store" to_port="input"/>
          <connect from_op="Store" from_port="through" to_op="Increment counter" to_port="through 1"/>
          <connect from_op="Increment counter" from_port="through 1" to_port="output 1"/>
          <portSpacing port="source_example set" spacing="0"/>
          <portSpacing port="sink_performance" spacing="0"/>
          <portSpacing port="sink_output 1" spacing="0"/>
          <portSpacing port="sink_output 2" spacing="0"/>
        </process>
      </operator>
      <connect from_op="Retrieve Iris" from_port="output" to_op="Prepare counter" to_port="through 1"/>
      <connect from_op="Prepare counter" from_port="through 1" to_op="Multiply" to_port="input"/>
      <connect from_op="Multiply" from_port="output 1" to_op="Loop Data Sets" to_port="example set 1"/>
      <connect from_op="Multiply" from_port="output 2" to_op="Loop Data Sets" to_port="example set 2"/>
      <connect from_op="Multiply" from_port="output 3" to_op="Loop Data Sets" to_port="example set 3"/>
      <connect from_op="Loop Data Sets" from_port="output 1" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>
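Since the question is about Write CSV rather than Store, the same counter macro could probably be used directly in the file path. The fragment below is only a sketch of an operator that might take the place of Store inside the loop. The `write_csv` class and its `csv_file` parameter are taken from the other example in this thread; the `mypath` prefix and the `example.csv` file name are placeholders, and the layout attributes (height, width, x, y) are illustrative.

```xml
<!-- Sketch: placed inside Loop Data Sets instead of Store.
     %{i} is the counter macro set by "Prepare counter" and
     incremented by "Increment counter" in each iteration.
     "mypath" and "example.csv" are placeholder names. -->
<operator activated="true" class="write_csv" compatibility="9.4.001-SNAPSHOT" expanded="true" height="82" name="Write CSV" width="90" x="179" y="34">
  <parameter key="csv_file" value="mypath/%{i}/example.csv"/>
</operator>
```

With the counter starting at 1, the first iteration would resolve the path to mypath/1/example.csv, the second to mypath/2/example.csv, and so on. Whether Write CSV creates missing folders is not confirmed in this thread, so it may be worth creating them up front.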
Regards,
Marco
Answers
-
It seems like you need two nested Loop Values operators.
Use the outer one to loop through the category_id values and the inner one to loop through the set_id values, then apply your logic inside. You can then save the result using the iteration macros for both the category ID and the set ID, as in the simplified example below:

<?xml version="1.0" encoding="UTF-8"?>
<process version="9.3.001">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="9.3.001" expanded="true" name="Process">
    <parameter key="logverbosity" value="init"/>
    <parameter key="random_seed" value="2001"/>
    <parameter key="send_mail" value="never"/>
    <parameter key="notification_email" value=""/>
    <parameter key="process_duration_for_mail" value="30"/>
    <parameter key="encoding" value="UTF-8"/>
    <process expanded="true">
      <operator activated="true" class="utility:create_exampleset" compatibility="9.3.001" expanded="true" height="68" name="Create ExampleSet" width="90" x="112" y="34">
        <parameter key="generator_type" value="comma separated text"/>
        <parameter key="number_of_examples" value="100"/>
        <parameter key="use_stepsize" value="false"/>
        <list key="function_descriptions"/>
        <parameter key="add_id_attribute" value="false"/>
        <list key="numeric_series_configuration"/>
        <list key="date_series_configuration"/>
        <list key="date_series_configuration (interval)"/>
        <parameter key="date_format" value="yyyy-MM-dd HH:mm:ss"/>
        <parameter key="time_zone" value="SYSTEM"/>
        <parameter key="input_csv_text" value="category_id,set_id,something&#10;1,1,x&#10;1,1,y&#10;1,2,z&#10;2,1,a&#10;2,2,b&#10;2,2,c&#10;"/>
        <parameter key="column_separator" value=","/>
        <parameter key="parse_all_as_nominal" value="true"/>
        <parameter key="decimal_point_character" value="."/>
        <parameter key="trim_attribute_names" value="true"/>
      </operator>
      <operator activated="true" class="concurrency:loop_values" compatibility="9.3.001" expanded="true" height="82" name="Loop Values" width="90" x="246" y="34">
        <parameter key="attribute" value="category_id"/>
        <parameter key="iteration_macro" value="cid"/>
        <parameter key="reuse_results" value="false"/>
        <parameter key="enable_parallel_execution" value="false"/>
        <process expanded="true">
          <operator activated="true" class="filter_examples" compatibility="9.3.001" expanded="true" height="103" name="Filter Examples" width="90" x="45" y="34">
            <parameter key="parameter_expression" value=""/>
            <parameter key="condition_class" value="custom_filters"/>
            <parameter key="invert_filter" value="false"/>
            <list key="filters_list">
              <parameter key="filters_entry_key" value="category_id.equals.%{cid}"/>
            </list>
            <parameter key="filters_logic_and" value="true"/>
            <parameter key="filters_check_metadata" value="true"/>
          </operator>
          <operator activated="true" class="concurrency:loop_values" compatibility="9.3.001" expanded="true" height="82" name="Loop Values (2)" width="90" x="179" y="34">
            <parameter key="attribute" value="set_id"/>
            <parameter key="iteration_macro" value="sid"/>
            <parameter key="reuse_results" value="false"/>
            <parameter key="enable_parallel_execution" value="false"/>
            <process expanded="true">
              <operator activated="true" class="filter_examples" compatibility="9.3.001" expanded="true" height="103" name="Filter Examples (2)" width="90" x="45" y="34">
                <parameter key="parameter_expression" value=""/>
                <parameter key="condition_class" value="custom_filters"/>
                <parameter key="invert_filter" value="false"/>
                <list key="filters_list">
                  <parameter key="filters_entry_key" value="set_id.equals.%{sid}"/>
                </list>
                <parameter key="filters_logic_and" value="true"/>
                <parameter key="filters_check_metadata" value="true"/>
              </operator>
              <operator activated="true" breakpoints="before" class="write_csv" compatibility="9.3.001" expanded="true" height="82" name="Write CSV" width="90" x="179" y="34">
                <parameter key="csv_file" value="mypath/%{cid}/%{sid}/filename.csv"/>
                <parameter key="column_separator" value=";"/>
                <parameter key="write_attribute_names" value="true"/>
                <parameter key="quote_nominal_values" value="true"/>
                <parameter key="format_date_attributes" value="true"/>
                <parameter key="append_to_file" value="false"/>
                <parameter key="encoding" value="UTF-8"/>
              </operator>
              <connect from_port="input 1" to_op="Filter Examples (2)" to_port="example set input"/>
              <connect from_op="Filter Examples (2)" from_port="example set output" to_op="Write CSV" to_port="input"/>
              <portSpacing port="source_input 1" spacing="0"/>
              <portSpacing port="source_input 2" spacing="0"/>
              <portSpacing port="sink_output 1" spacing="0"/>
            </process>
          </operator>
          <connect from_port="input 1" to_op="Filter Examples" to_port="example set input"/>
          <connect from_op="Filter Examples" from_port="example set output" to_op="Loop Values (2)" to_port="input 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="source_input 2" spacing="0"/>
          <portSpacing port="sink_output 1" spacing="0"/>
          <portSpacing port="sink_output 2" spacing="0"/>
        </process>
      </operator>
      <connect from_op="Create ExampleSet" from_port="output" to_op="Loop Values" to_port="input 1"/>
      <connect from_op="Loop Values" from_port="output 1" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>
-
Hi, I modified my previous post. The custom values added to the CSV file path are not obtained from the example set. It is just an index. I mean something like:

index = 0;
exampleSets = [A, B, C, D];
foreach (exampleSets as exampleSet) {
    ++index;
    path = index . '/example.csv';
}
-
Thanks! One last request: can you check my process now (and sorry for the Polish descriptions above the operators)? I hope it is OK now...
-
Hi,
If you put the input CSV files into one folder, you could use Loop Files with a single Read CSV instead of multiple Read CSV operators, but other than that, the macro approach looks fine.
Regards,
Marco