Parallel execution of the
maciek
New Altair Community Member
Dear Community,
My question is: how can I (or is it even possible to) run the "Execute R" operator in parallel inside a "Loop" operator?
I have a collection of example sets, and for every example set I want to perform some calculation using the "Execute R" operator. I enabled the parallel execution checkbox, expecting it to run Rscript.exe in parallel. However, it does not.
I attach an example process:
<?xml version="1.0" encoding="UTF-8"?><process version="9.2.001">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="9.2.001" expanded="true" name="Process">
    <parameter key="logverbosity" value="init"/>
    <parameter key="random_seed" value="2001"/>
    <parameter key="send_mail" value="never"/>
    <parameter key="notification_email" value=""/>
    <parameter key="process_duration_for_mail" value="30"/>
    <parameter key="encoding" value="SYSTEM"/>
    <process expanded="true">
      <operator activated="true" class="concurrency:loop" compatibility="9.2.001" expanded="true" height="82" name="Generate Data" width="90" x="179" y="187">
        <parameter key="number_of_iterations" value="50"/>
        <parameter key="iteration_macro" value="iteration"/>
        <parameter key="reuse_results" value="false"/>
        <parameter key="enable_parallel_execution" value="true"/>
        <process expanded="true">
          <operator activated="true" class="utility:create_exampleset" compatibility="9.2.001" expanded="true" height="68" name="Create ExampleSet" width="90" x="112" y="85">
            <parameter key="generator_type" value="numeric series"/>
            <parameter key="number_of_examples" value="1000"/>
            <parameter key="use_stepsize" value="true"/>
            <list key="function_descriptions"/>
            <parameter key="add_id_attribute" value="false"/>
            <list key="numeric_series_configuration">
              <parameter key="Att1" value="linear.0\.0.1\.0"/>
            </list>
            <list key="date_series_configuration"/>
            <list key="date_series_configuration (interval)"/>
            <parameter key="date_format" value="yyyy-MM-dd HH:mm:ss"/>
            <parameter key="time_zone" value="SYSTEM"/>
            <parameter key="column_separator" value=","/>
            <parameter key="parse_all_as_nominal" value="false"/>
            <parameter key="decimal_point_character" value="."/>
            <parameter key="trim_attribute_names" value="true"/>
          </operator>
          <connect from_op="Create ExampleSet" from_port="output" to_port="output 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_output 1" spacing="0"/>
          <portSpacing port="sink_output 2" spacing="0"/>
        </process>
      </operator>
      <operator activated="true" class="concurrency:loop" compatibility="9.2.001" expanded="true" height="82" name="Execute R Script" width="90" x="380" y="187">
        <parameter key="number_of_iterations" value="50"/>
        <parameter key="iteration_macro" value="iteration"/>
        <parameter key="reuse_results" value="false"/>
        <parameter key="enable_parallel_execution" value="true"/>
        <process expanded="true">
          <operator activated="true" class="select" compatibility="9.2.001" expanded="true" height="68" name="Select" width="90" x="179" y="85">
            <parameter key="index" value="%{iteration}"/>
            <parameter key="unfold" value="false"/>
          </operator>
          <operator activated="true" class="r_scripting:execute_r" compatibility="9.1.000" expanded="true" height="103" name="Execute R" width="90" x="380" y="85">
            <parameter key="script" value="# rm_main is a mandatory function, &#10;# the number of arguments has to be the number of input ports (can be none)&#10;rm_main = function(data)&#10;{&#10; res &lt;- mean(data$Att1)&#10; library(qdapTools)&#10; return (list2df(res))&#10;}&#10;"/>
          </operator>
          <connect from_port="input 1" to_op="Select" to_port="collection"/>
          <connect from_op="Select" from_port="selected" to_op="Execute R" to_port="input 1"/>
          <connect from_op="Execute R" from_port="output 1" to_port="output 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="source_input 2" spacing="0"/>
          <portSpacing port="sink_output 1" spacing="0"/>
          <portSpacing port="sink_output 2" spacing="0"/>
        </process>
      </operator>
      <connect from_op="Generate Data" from_port="output 1" to_op="Execute R Script" to_port="input 1"/>
      <connect from_op="Execute R Script" from_port="output 1" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>
Could you please help me fix this?
Regards
Maciek
Best Answers
Hi Maciek,
I ran a similar process and it worked in parallel. Could you please check Settings - Manage Licenses to make sure your current license supports multiple logical processors?
Answers
I just tested it with my Studio Large license and it runs in parallel. I modified the script because I didn't want to install the library:
# rm_main is a mandatory function,
# the number of arguments has to be the number of input ports (can be none)
rm_main = function(data)
{
  res <- mean(data$Att1)
  data$Att1 = data$Att1 - res
  return (data)
}
I think it could be either a license problem or an OS (macOS, Linux) problem. If you are on macOS or Linux, which Java package are you using?
Also note that if the computation is very intensive, it may be better to move the parallelization into R. You also have full access to your hardware in R (sorry, I had to say it).
Regards,
Sebastian
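For readers wondering what "moving the parallelization to R" might look like, here is a minimal sketch using R's built-in parallel package. It is an illustration only: the helper name center_all, the worker count of 2, and the stand-in data are assumptions that do not come from the thread, and the per-set calculation simply mirrors the mean-centering from the modified script above. A PSOCK cluster is used because it also works on Windows, where the question mentions Rscript.exe.

```r
library(parallel)

# Hypothetical helper (not from the thread): mean-centre Att1 in every
# data frame of a list, spreading the work over a small worker cluster.
center_all <- function(datasets, workers = 2) {
  cl <- makeCluster(workers)            # PSOCK cluster, also works on Windows
  on.exit(stopCluster(cl), add = TRUE)  # always shut the workers down
  parLapply(cl, datasets, function(d) {
    d$Att1 <- d$Att1 - mean(d$Att1)
    d
  })
}

# Stand-in for the 50 generated example sets (assumed shape: one numeric column Att1)
datasets <- lapply(1:50, function(i) {
  data.frame(Att1 = seq(0, by = 1, length.out = 1000))
})

centered <- center_all(datasets)
```

Starting the workers has a fixed cost, so for something as cheap as a mean this is likely slower than a plain lapply; it should only pay off when each per-set calculation is expensive. Whether "Execute R" can receive the whole collection in a single call is not confirmed in this thread, so the sketch covers only the R side.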