How to get forecast values of future from time series data

syedghouri68
syedghouri68 New Altair Community Member
edited November 5 in Community Q&A

Hello

I have been using RapidMiner for a week and am having difficulty developing time series models. I have used R for about six months and was able to integrate R scripts with RapidMiner without any trouble.

In R, the forecast function takes a fitted model and the number of future periods, and returns the forecast values for those periods. This is exactly what I integrated into RapidMiner, and I got the forecast values using an ARIMA model.

My question is: how can I get the forecast values using ARIMA in RapidMiner itself, without integrating an R script? Most of the examples I have seen on the web evaluate the model on training data only. In simple terms, I have historic weekly data as

week      units
week1     20
week2     35
week3     27
week4     12
......
week500   43

I need forecast values for

week501   ?
week502   ?
week503   ?
...
week552   ?

 

Links I referred to:

http://www.simafore.com/blog/bid/109175/Time-Series-Forecasting-using-RapidMiner-for-cost-modeling-2-of-2

https://www.youtube.com/watch?v=w0vSSEq2bn0

 

Thank you

Answers

  • Thomas_Ott
    Thomas_Ott New Altair Community Member

    Hi,

     

    There is no native ARIMA operator in the Series Extension yet. However, you can try tweaking the attached process. It comes from Bala and Vijay's book and forecasts point values in RapidMiner based on previous time series patterns.

     

    With respect to ARIMA on training data, you can embed your ARIMA R script inside a Sliding Window Validation operator and test the performance.

    <?xml version="1.0" encoding="UTF-8"?><process version="7.4.000">
    <context>
    <input/>
    <output/>
    <macros>
    <macro>
    <key>horizon</key>
    <value>5</value>
    </macro>
    <macro>
    <key>symbol</key>
    <value>XOM</value>
    </macro>
    <macro>
    <key>start_date</key>
    <value>2016-01-01</value>
    </macro>
    <macro>
    <key>end_date</key>
    <value>2017-03-21</value>
    </macro>
    </macros>
    </context>
    <operator activated="true" class="process" compatibility="6.0.002" expanded="true" name="Process">
    <parameter key="encoding" value="SYSTEM"/>
    <process expanded="true">
    <operator activated="true" class="quantx1:yahoo_historical_data_extractor" compatibility="1.0.006" expanded="true" height="82" name="Yahoo Historical Stock Data" width="90" x="45" y="34">
    <parameter key="I agree to abide by Yahoo's Terms &amp; Conditions on financial data usage" value="true"/>
    <parameter key="Quick Stock Ticker Data" value="true"/>
    <parameter key="Stock Ticker" value="%{symbol}"/>
    <parameter key="select_fields" value="CLOSE"/>
    <parameter key="date_format" value="yyyy-MM-dd"/>
    <parameter key="date_start" value="%{start_date}"/>
    <parameter key="date_end" value="%{end_date}"/>
    </operator>
    <operator activated="true" class="set_role" compatibility="5.3.013" expanded="true" height="82" name="Set Role" width="90" x="179" y="34">
    <parameter key="attribute_name" value="Date"/>
    <parameter key="target_role" value="id"/>
    <list key="set_additional_roles"/>
    </operator>
    <operator activated="true" class="rename" compatibility="7.4.000" expanded="true" height="82" name="Rename" width="90" x="179" y="136">
    <parameter key="old_name" value="%{symbol}_CLOSE"/>
    <parameter key="new_name" value="Close"/>
    <list key="rename_additional_attributes"/>
    </operator>
    <operator activated="true" class="select_attributes" compatibility="7.4.000" expanded="true" height="82" name="Select Attributes" width="90" x="179" y="238">
    <parameter key="attribute_filter_type" value="subset"/>
    <parameter key="attributes" value="Close|Date"/>
    </operator>
    <operator activated="true" class="filter_examples" compatibility="6.4.000" expanded="true" height="103" name="Filter Examples" width="90" x="179" y="340">
    <parameter key="condition_class" value="no_missing_attributes"/>
    <list key="filters_list"/>
    </operator>
    <operator activated="true" class="series:windowing" compatibility="7.4.000" expanded="true" height="82" name="Windowing" width="90" x="380" y="34">
    <parameter key="window_size" value="6"/>
    <parameter key="create_label" value="true"/>
    <parameter key="label_attribute" value="Close"/>
    </operator>
    <operator activated="true" class="series:windowing" compatibility="7.4.000" expanded="true" height="82" name="Windowing (2)" width="90" x="380" y="136">
    <parameter key="window_size" value="6"/>
    <parameter key="label_attribute" value="Close"/>
    </operator>
    <operator activated="true" class="extract_macro" compatibility="7.4.000" expanded="true" height="68" name="Extract Macro" width="90" x="380" y="238">
    <parameter key="macro" value="n_examples"/>
    <list key="additional_macros"/>
    </operator>
    <operator activated="true" class="generate_macro" compatibility="7.4.000" expanded="true" height="82" name="Generate Macro" width="90" x="380" y="340">
    <list key="function_descriptions">
    <parameter key="filter_range" value="eval(%{n_examples})-1"/>
    </list>
    </operator>
    <operator activated="true" class="filter_example_range" compatibility="7.4.000" expanded="true" height="82" name="Filter Example Range" width="90" x="380" y="442">
    <parameter key="first_example" value="1"/>
    <parameter key="last_example" value="%{filter_range}"/>
    <parameter key="invert_filter" value="true"/>
    </operator>
    <operator activated="true" class="remember" compatibility="7.4.000" expanded="true" height="68" name="Remember" width="90" x="514" y="442">
    <parameter key="name" value="LastRow"/>
    </operator>
    <operator activated="true" class="optimize_parameters_grid" compatibility="7.4.000" expanded="true" height="124" name="Optimize Parameters (Grid)" width="90" x="514" y="34">
    <list key="parameters">
    <parameter key="SVM.kernel_gamma" value="[0.001;1000;6;logarithmic]"/>
    <parameter key="SVM.C" value="[0;1000;10;linear]"/>
    </list>
    <process expanded="true">
    <operator activated="true" class="series:sliding_window_validation" compatibility="7.4.000" expanded="true" height="124" name="Validation" width="90" x="112" y="34">
    <parameter key="training_window_width" value="20"/>
    <parameter key="training_window_step_size" value="1"/>
    <parameter key="test_window_width" value="20"/>
    <parameter key="horizon" value="%{horizon}"/>
    <process expanded="true">
    <operator activated="true" class="support_vector_machine" compatibility="7.4.000" expanded="true" height="124" name="SVM" width="90" x="179" y="34">
    <parameter key="kernel_type" value="radial"/>
    <parameter key="kernel_gamma" value="0.009999999999999998"/>
    </operator>
    <connect from_port="training" to_op="SVM" to_port="training set"/>
    <connect from_op="SVM" from_port="model" to_port="model"/>
    <portSpacing port="source_training" spacing="0"/>
    <portSpacing port="sink_model" spacing="0"/>
    <portSpacing port="sink_through 1" spacing="0"/>
    </process>
    <process expanded="true">
    <operator activated="true" class="apply_model" compatibility="7.4.000" expanded="true" height="82" name="Apply Model (2)" width="90" x="45" y="34">
    <list key="application_parameters"/>
    </operator>
    <operator activated="true" class="series:forecasting_performance" compatibility="7.4.000" expanded="true" height="82" name="Performance" width="90" x="246" y="34">
    <parameter key="horizon" value="%{horizon}"/>
    <parameter key="main_criterion" value="prediction_trend_accuracy"/>
    </operator>
    <connect from_port="model" to_op="Apply Model (2)" to_port="model"/>
    <connect from_port="test set" to_op="Apply Model (2)" to_port="unlabelled data"/>
    <connect from_op="Apply Model (2)" from_port="labelled data" to_op="Performance" to_port="labelled data"/>
    <connect from_op="Performance" from_port="performance" to_port="averagable 1"/>
    <portSpacing port="source_model" spacing="0"/>
    <portSpacing port="source_test set" spacing="0"/>
    <portSpacing port="source_through 1" spacing="0"/>
    <portSpacing port="sink_averagable 1" spacing="0"/>
    <portSpacing port="sink_averagable 2" spacing="0"/>
    </process>
    </operator>
    <operator activated="true" class="log" compatibility="7.4.000" expanded="true" height="82" name="Log" width="90" x="313" y="85">
    <parameter key="filename" value="tmp"/>
    <list key="log">
    <parameter key="Gamma" value="operator.SVM.parameter.kernel_gamma"/>
    <parameter key="C" value="operator.SVM.parameter.C"/>
    <parameter key="Forecast Perf" value="operator.Validation.value.performance"/>
    </list>
    </operator>
    <connect from_port="input 1" to_op="Validation" to_port="training"/>
    <connect from_op="Validation" from_port="model" to_port="result 1"/>
    <connect from_op="Validation" from_port="averagable 1" to_op="Log" to_port="through 1"/>
    <connect from_op="Log" from_port="through 1" to_port="performance"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="source_input 2" spacing="0"/>
    <portSpacing port="sink_performance" spacing="0"/>
    <portSpacing port="sink_result 1" spacing="0"/>
    <portSpacing port="sink_result 2" spacing="0"/>
    </process>
    </operator>
    <operator activated="false" class="series:sliding_window_validation" compatibility="7.4.000" expanded="true" height="124" name="Validation For the Masses" width="90" x="514" y="289">
    <parameter key="training_window_width" value="8"/>
    <parameter key="training_window_step_size" value="1"/>
    <parameter key="test_window_width" value="8"/>
    <parameter key="horizon" value="%{horizon}"/>
    <process expanded="true">
    <operator activated="true" class="support_vector_machine" compatibility="7.4.000" expanded="true" height="124" name="SVM (2)" width="90" x="217" y="34">
    <parameter key="kernel_type" value="radial"/>
    <parameter key="kernel_gamma" value="0.1"/>
    <parameter key="C" value="1000.0"/>
    </operator>
    <connect from_port="training" to_op="SVM (2)" to_port="training set"/>
    <connect from_op="SVM (2)" from_port="model" to_port="model"/>
    <portSpacing port="source_training" spacing="0"/>
    <portSpacing port="sink_model" spacing="0"/>
    <portSpacing port="sink_through 1" spacing="0"/>
    </process>
    <process expanded="true">
    <operator activated="true" class="apply_model" compatibility="7.4.000" expanded="true" height="82" name="Apply Model (3)" width="90" x="45" y="34">
    <list key="application_parameters"/>
    </operator>
    <operator activated="true" class="series:forecasting_performance" compatibility="7.4.000" expanded="true" height="82" name="Performance (2)" width="90" x="246" y="34">
    <parameter key="horizon" value="%{horizon}"/>
    <parameter key="main_criterion" value="prediction_trend_accuracy"/>
    </operator>
    <connect from_port="model" to_op="Apply Model (3)" to_port="model"/>
    <connect from_port="test set" to_op="Apply Model (3)" to_port="unlabelled data"/>
    <connect from_op="Apply Model (3)" from_port="labelled data" to_op="Performance (2)" to_port="labelled data"/>
    <connect from_op="Performance (2)" from_port="performance" to_port="averagable 1"/>
    <portSpacing port="source_model" spacing="0"/>
    <portSpacing port="source_test set" spacing="0"/>
    <portSpacing port="source_through 1" spacing="0"/>
    <portSpacing port="sink_averagable 1" spacing="0"/>
    <portSpacing port="sink_averagable 2" spacing="0"/>
    </process>
    </operator>
    <operator activated="true" class="loop" compatibility="7.4.000" expanded="true" height="82" name="Loop" width="90" x="648" y="34">
    <parameter key="set_iteration_macro" value="true"/>
    <parameter key="macro_name" value="loop_forecasts"/>
    <parameter key="iterations" value="%{horizon}"/>
    <process expanded="true">
    <operator activated="true" class="recall" compatibility="7.4.000" expanded="true" height="68" name="Recall" width="90" x="45" y="85">
    <parameter key="name" value="LastRow"/>
    <parameter key="remove_from_store" value="false"/>
    </operator>
    <operator activated="true" class="apply_model" compatibility="7.1.001" expanded="true" height="82" name="Apply Model" width="90" x="246" y="34">
    <list key="application_parameters"/>
    </operator>
    <operator activated="true" class="generate_attributes" compatibility="7.4.000" expanded="true" height="82" name="Generate Attributes" width="90" x="380" y="34">
    <list key="function_descriptions">
    <parameter key="Date" value="date_add(Date,eval(%{loop_forecasts}),DATE_UNIT_DAY)"/>
    </list>
    </operator>
    <operator activated="true" class="set_role" compatibility="5.3.013" expanded="true" height="82" name="Set Role (2)" width="90" x="514" y="34">
    <parameter key="attribute_name" value="prediction(label)"/>
    <list key="set_additional_roles"/>
    </operator>
    <operator activated="true" class="select_attributes" compatibility="7.4.000" expanded="true" height="82" name="Select Attributes (3)" width="90" x="648" y="34">
    <parameter key="attribute_filter_type" value="single"/>
    <parameter key="attribute" value="prediction(label)"/>
    </operator>
    <operator activated="true" class="replace" compatibility="7.4.000" expanded="true" height="82" name="Replace" width="90" x="782" y="34">
    <parameter key="replace_what" value="Close"/>
    <parameter key="replace_by" value="$1-"/>
    </operator>
    <operator activated="true" class="materialize_data" compatibility="7.4.000" expanded="true" height="82" name="Materialize Data (2)" width="90" x="916" y="34"/>
    <connect from_port="input 1" to_op="Apply Model" to_port="model"/>
    <connect from_op="Recall" from_port="result" to_op="Apply Model" to_port="unlabelled data"/>
    <connect from_op="Apply Model" from_port="labelled data" to_op="Generate Attributes" to_port="example set input"/>
    <connect from_op="Generate Attributes" from_port="example set output" to_op="Set Role (2)" to_port="example set input"/>
    <connect from_op="Set Role (2)" from_port="example set output" to_op="Select Attributes (3)" to_port="example set input"/>
    <connect from_op="Select Attributes (3)" from_port="example set output" to_op="Replace" to_port="example set input"/>
    <connect from_op="Replace" from_port="example set output" to_op="Materialize Data (2)" to_port="example set input"/>
    <connect from_op="Materialize Data (2)" from_port="example set output" to_port="output 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="source_input 2" spacing="0"/>
    <portSpacing port="sink_output 1" spacing="0"/>
    <portSpacing port="sink_output 2" spacing="0"/>
    </process>
    </operator>
    <operator activated="true" class="append" compatibility="7.4.000" expanded="true" height="82" name="Append" width="90" x="782" y="34"/>
    <connect from_op="Yahoo Historical Stock Data" from_port="example set" to_op="Set Role" to_port="example set input"/>
    <connect from_op="Set Role" from_port="example set output" to_op="Rename" to_port="example set input"/>
    <connect from_op="Rename" from_port="example set output" to_op="Select Attributes" to_port="example set input"/>
    <connect from_op="Select Attributes" from_port="example set output" to_op="Filter Examples" to_port="example set input"/>
    <connect from_op="Filter Examples" from_port="example set output" to_op="Windowing" to_port="example set input"/>
    <connect from_op="Windowing" from_port="example set output" to_op="Optimize Parameters (Grid)" to_port="input 1"/>
    <connect from_op="Windowing" from_port="original" to_op="Windowing (2)" to_port="example set input"/>
    <connect from_op="Windowing (2)" from_port="example set output" to_op="Extract Macro" to_port="example set"/>
    <connect from_op="Extract Macro" from_port="example set" to_op="Generate Macro" to_port="through 1"/>
    <connect from_op="Generate Macro" from_port="through 1" to_op="Filter Example Range" to_port="example set input"/>
    <connect from_op="Filter Example Range" from_port="example set output" to_op="Remember" to_port="store"/>
    <connect from_op="Optimize Parameters (Grid)" from_port="performance" to_port="result 2"/>
    <connect from_op="Optimize Parameters (Grid)" from_port="result 1" to_op="Loop" to_port="input 1"/>
    <connect from_op="Loop" from_port="output 1" to_op="Append" to_port="example set 1"/>
    <connect from_op="Append" from_port="merged set" to_port="result 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="sink_result 1" spacing="0"/>
    <portSpacing port="sink_result 2" spacing="231"/>
    <portSpacing port="sink_result 3" spacing="0"/>
    </process>
    </operator>
    </process>
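
For readers who want to see the windowing idea outside RapidMiner, here is a minimal sketch in plain Python. It is not a translation of the process above: `make_windows`, `recursive_forecast`, and `window_mean` are hypothetical names, the mean-of-window predictor is only a stand-in for the trained SVM, and the series is toy data. It shows how a series becomes lagged training examples, and how each forecast can be fed back in to reach further into the future (week501, week502, and so on).

```python
from typing import Callable, List, Sequence, Tuple


def make_windows(series: Sequence[float], window_size: int) -> List[Tuple[List[float], float]]:
    """Turn a series into (lagged inputs, next value) training pairs."""
    return [
        (list(series[i:i + window_size]), series[i + window_size])
        for i in range(len(series) - window_size)
    ]


def recursive_forecast(
    series: Sequence[float],
    window_size: int,
    horizon: int,
    predict: Callable[[List[float]], float],
) -> List[float]:
    """Forecast `horizon` steps ahead, feeding each prediction back as input."""
    history = list(series)
    forecasts = []
    for _ in range(horizon):
        window = history[-window_size:]   # most recent observations
        next_value = predict(window)      # one-step-ahead prediction
        forecasts.append(next_value)
        history.append(next_value)        # the prediction becomes the newest "observation"
    return forecasts


def window_mean(window: List[float]) -> float:
    """Stand-in model (an assumption for illustration): predict the window mean."""
    return sum(window) / len(window)


# Toy weekly values, for illustration only.
units = [20.0, 35.0, 27.0, 12.0, 43.0, 31.0]
pairs = make_windows(units, window_size=3)
future = recursive_forecast(units, window_size=3, horizon=2, predict=window_mean)
print(pairs)
print(future)
```

In practice `predict` would wrap whatever model was trained on the pairs from `make_windows`. Errors compound as the horizon grows, because later forecasts are built on earlier forecasts.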
  • syedghouri68
    syedghouri68 New Altair Community Member

    Many thanks for your reply.

     

    I managed to fit the ARIMA model with some constant p and q values. However, I am having a hard time trying different combinations of p and q, as I have to do it manually.

     

    Syed

  • Thomas_Ott
    Thomas_Ott New Altair Community Member
    Answer ✓

    Well then, you're going to love this p,d,q optimizing process. Make sure you have the Fin/Econ extension installed too; it pulls some sample data. This process optimizes around the AIC.

     

    <?xml version="1.0" encoding="UTF-8"?><process version="7.4.000">
    <context>
    <input/>
    <output/>
    <macros/>
    </context>
    <operator activated="true" class="process" compatibility="7.4.000" expanded="true" name="Process">
    <parameter key="encoding" value="SYSTEM"/>
    <process expanded="true">
    <operator activated="true" class="quantx1:yahoo_historical_data_extractor" compatibility="1.0.006" expanded="true" height="82" name="Yahoo Historical Stock Data" width="90" x="45" y="120">
    <parameter key="I agree to abide by Yahoo's Terms &amp; Conditions on financial data usage" value="true"/>
    <parameter key="Quick Stock Ticker Data" value="true"/>
    <parameter key="Stock Ticker" value="S&amp;P"/>
    <parameter key="select_fields" value="VOLUME|OPEN|DAY_LOW|DAY_HIGH|CLOSE|ADJUSTED_CLOSE"/>
    <parameter key="date_format" value="yyyy-MM-dd"/>
    <parameter key="date_start" value="2013-01-01"/>
    <parameter key="date_end" value="2015-06-03"/>
    </operator>
    <operator activated="true" class="rename" compatibility="7.4.000" expanded="true" height="82" name="Rename" width="90" x="179" y="120">
    <parameter key="old_name" value="S&amp;P_ADJUSTED_CLOSE"/>
    <parameter key="new_name" value="AClose"/>
    <list key="rename_additional_attributes">
    <parameter key="S&amp;P_CLOSE" value="Close"/>
    <parameter key="S&amp;P_DAY_HIGH" value="High"/>
    <parameter key="S&amp;P_DAY_LOW" value="Low"/>
    <parameter key="S&amp;P_OPEN" value="Open"/>
    <parameter key="S&amp;P_VOLUME" value="Volume"/>
    </list>
    </operator>
    <operator activated="true" class="multiply" compatibility="7.4.000" expanded="true" height="124" name="Multiply" width="90" x="313" y="120"/>
    <operator activated="true" class="r_scripting:execute_r" compatibility="7.2.000" expanded="true" height="82" name="Forecasting" width="90" x="715" y="435">
    <parameter key="script" value="### Fit ARIMA(3,1,3) in R and forecast the next 5 periods&#10;rm_main = function(data)&#10;{&#10;&#9;library(forecast)&#10;&#9;sp &lt;- data&#10;&#9;sp$Date &lt;- as.Date(sp$Date)&#10;&#9;# name the fitted model fit so it does not shadow the arima function&#10;&#9;fit &lt;- arima(ts(sp$Close), order=c(3,1,3))&#10;&#9;print(fit)&#10;&#9;# use the generic forecast(); forecast.Arima is not exported in recent versions of the forecast package&#10;&#9;arimaforecast &lt;- forecast(fit, h=5)&#10;&#9;print(arimaforecast)&#10;&#9;return(as.data.frame(arimaforecast))&#10;}&#10;"/>
    </operator>
    <operator activated="true" class="optimize_parameters_grid" compatibility="7.4.000" expanded="true" height="103" name="Optimize Parameters (Grid)" width="90" x="514" y="300">
    <list key="parameters">
    <parameter key="Set p.value" value="[0;3;3;linear]"/>
    <parameter key="Set d.value" value="[0.0;2;2;linear]"/>
    <parameter key="Set q.value" value="[0.0;4;4;linear]"/>
    </list>
    <process expanded="true">
    <operator activated="true" class="set_macro" compatibility="7.4.000" expanded="true" height="76" name="Set p" width="90" x="112" y="30">
    <parameter key="macro" value="p"/>
    <parameter key="value" value="3.0"/>
    </operator>
    <operator activated="true" class="set_macro" compatibility="7.4.000" expanded="true" height="76" name="Set d" width="90" x="112" y="120">
    <parameter key="macro" value="d"/>
    <parameter key="value" value="2.0"/>
    </operator>
    <operator activated="true" class="set_macro" compatibility="7.4.000" expanded="true" height="76" name="Set q" width="90" x="112" y="210">
    <parameter key="macro" value="q"/>
    <parameter key="value" value="4.0"/>
    </operator>
    <operator activated="true" class="r_scripting:execute_r" compatibility="7.2.000" expanded="true" height="112" name="ARIMA" width="90" x="447" y="75">
    <parameter key="script" value="### Call this R script to get the AIC of an ARIMA model&#10;rm_main = function(data)&#10;{&#10;&#9;sp &lt;- data&#10;&#9;sp$Date &lt;- as.Date(sp$Date)&#10;&#9;# p, d and q are filled in from the macros set by the grid optimization&#10;&#9;fit &lt;- arima(sp$Close, order=c(%{p},%{d},%{q}))&#10;&#9;#print(fit$aic)&#10;&#9;# as.data.table names the single column V1, which the Performance operator reads&#10;&#9;return(as.data.table(fit$aic))&#10;}&#10;"/>
    <description align="center" color="transparent" colored="false" width="126">Fit ARIMA model in R with different (p,d,q)</description>
    </operator>
    <operator activated="true" class="extract_performance" compatibility="7.4.000" expanded="true" height="76" name="Performance" width="90" x="581" y="75">
    <parameter key="performance_type" value="data_value"/>
    <parameter key="attribute_name" value="V1"/>
    <parameter key="example_index" value="1"/>
    <parameter key="optimization_direction" value="minimize"/>
    </operator>
    <operator activated="true" class="log" compatibility="7.4.000" expanded="true" height="76" name="Log" width="90" x="715" y="75">
    <list key="log">
    <parameter key="aic" value="operator.Performance.value.performance"/>
    <parameter key="p" value="operator.Set p.parameter.value"/>
    <parameter key="d" value="operator.Set d.parameter.value"/>
    <parameter key="q" value="operator.Set q.parameter.value"/>
    </list>
    </operator>
    <connect from_port="input 1" to_op="Set p" to_port="through 1"/>
    <connect from_op="Set p" from_port="through 1" to_op="ARIMA" to_port="input 1"/>
    <connect from_op="Set d" from_port="through 1" to_op="ARIMA" to_port="input 2"/>
    <connect from_op="Set q" from_port="through 1" to_op="ARIMA" to_port="input 3"/>
    <connect from_op="ARIMA" from_port="output 1" to_op="Performance" to_port="example set"/>
    <connect from_op="Performance" from_port="performance" to_op="Log" to_port="through 1"/>
    <connect from_op="Log" from_port="through 1" to_port="performance"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="source_input 2" spacing="0"/>
    <portSpacing port="sink_performance" spacing="36"/>
    <portSpacing port="sink_result 1" spacing="0"/>
    </process>
    </operator>
    <connect from_op="Yahoo Historical Stock Data" from_port="example set" to_op="Rename" to_port="example set input"/>
    <connect from_op="Rename" from_port="example set output" to_op="Multiply" to_port="input"/>
    <connect from_op="Multiply" from_port="output 1" to_port="result 1"/>
    <connect from_op="Multiply" from_port="output 2" to_op="Optimize Parameters (Grid)" to_port="input 1"/>
    <connect from_op="Multiply" from_port="output 3" to_op="Forecasting" to_port="input 1"/>
    <connect from_op="Forecasting" from_port="output 1" to_port="result 3"/>
    <connect from_op="Optimize Parameters (Grid)" from_port="performance" to_port="result 2"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="sink_result 1" spacing="90"/>
    <portSpacing port="sink_result 2" spacing="162"/>
    <portSpacing port="sink_result 3" spacing="126"/>
    <portSpacing port="sink_result 4" spacing="36"/>
    <description align="center" color="yellow" colored="false" height="62" resized="true" width="816" x="305" y="18">Look at Economic Time Series Data (automatically pulled) from public sites and integrate with ARIMA in R extension</description>
    <description align="center" color="yellow" colored="false" height="133" resized="true" width="635" x="490" y="83">Charts for data. Identify any unusual observations for all attributes: day low, high, open, close, adjusted close, volume</description>
    <description align="center" color="yellow" colored="false" height="177" resized="true" width="626" x="500" y="228">Find the optimal parameters for ARIMA (iterative; takes about 1 min)&lt;br&gt;Use the R extension for ARIMA models&lt;br&gt;For this demo data, ARIMA(3,1,3) is the best fit&lt;br/&gt;To choose the best-fit model: check the Log result, rank by AIC,&lt;br/&gt;and find the values of p, d, q corresponding to the minimum AIC</description>
    <description align="center" color="yellow" colored="false" height="116" resized="true" width="415" x="713" y="414">Apply ARIMA(3,1,3) for forecasting&lt;br&gt;predict the next 5 days' close price&lt;br&gt;</description>
    </process>
    </operator>
    </process>
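
The grid search that this process performs can be sketched in a few lines of plain Python. This is an illustration only, under stated assumptions: `grid_search` and `toy_aic` are hypothetical names, and `toy_aic` is an arbitrary stand-in whose minimum was placed at (2, 1, 1) just so the result is predictable. In a real run the score function would fit an ARIMA model for the given order and return its AIC, as the R script in the process does. The ranges assume the grid settings expand to p in 0..3, d in 0..2, and q in 0..4.

```python
from itertools import product
from typing import Callable, Iterable, List, Tuple

Order = Tuple[int, int, int]


def grid_search(
    p_values: Iterable[int],
    d_values: Iterable[int],
    q_values: Iterable[int],
    score: Callable[[Order], float],
) -> Tuple[Order, float, List[Tuple[Order, float]]]:
    """Score every (p, d, q) combination and return the one with the lowest score."""
    log = []
    for order in product(p_values, d_values, q_values):
        try:
            log.append((order, score(order)))
        except ValueError:
            continue  # skip orders for which the fit fails
    if not log:
        raise ValueError("no (p, d, q) combination could be scored")
    best_order, best_score = min(log, key=lambda row: row[1])
    return best_order, best_score, log


def toy_aic(order: Order) -> float:
    """Stand-in for a real ARIMA fit (assumption): lowest at (2, 1, 1) by construction."""
    p, d, q = order
    return 100.0 + (p - 2) ** 2 + (d - 1) ** 2 + (q - 1) ** 2


best_order, best_score, log = grid_search(range(0, 4), range(0, 3), range(0, 5), toy_aic)
print(best_order, best_score, len(log))
```

The `log` list plays the role of the Log operator: sorting it by score gives the same ranking by AIC that the process description suggests checking by hand.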
  • syedghouri68
    syedghouri68 New Altair Community Member

    Wow. Great.

    Many thanks for your help. 

     

    Syed