"Time series forecast (with Rapid Miner)"
DaiWizard
New Altair Community Member
Hi!
I've set up a model exactly as described by Thomas Ott of 'neuralmarkettrends' in videos 8-10 - and it's working well so far.
But what I still need is the probability of the predicted label (horizon = 1). The model only reports the averaged value, in the form of
prediction_trend_accuracy: 0.807 +/- 0.067 (mikro: 0.807).
Thanks for your help!
Answers
-
Hello.
I'm now using Google to find the video you describe.
Next time please use a direct link to the video that is of interest.
Video link:
https://www.youtube.com/watch?v=UmGIGEJMmN8
Can you upload your process?
As far as I understand the process is as follows:
- Order your data by date
- Split your data into two parts
- Use data before date X for training, use data after date X for testing.
- Features for training are created using windowing
- SVM is used as learner
* This process does not deal with horizons very well; neuralmarkettrends1 is aware of this, but does not want to complicate his video.
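The windowing and fixed split listed above can be sketched in a few lines of Python. This is only an illustration of the idea, not the RapidMiner implementation: the function names `make_windows` and `split_by_position` are hypothetical, and the exact offset convention for the horizon in RapidMiner's Windowing operator may differ from the one assumed here.

```python
from typing import List, Tuple


def make_windows(series: List[float], window_size: int, horizon: int) -> List[Tuple[List[float], float]]:
    """Turn a series into (window, label) pairs.

    Assumption: the label is the value `horizon` steps after the last
    value inside the window.
    """
    examples = []
    last_start = len(series) - window_size - horizon
    for start in range(last_start + 1):
        window = series[start:start + window_size]
        label = series[start + window_size - 1 + horizon]
        examples.append((window, label))
    return examples


def split_by_position(examples, cut):
    """Chronological split: examples before `cut` train, the rest test."""
    return examples[:cut], examples[cut:]


# Toy series that is already ordered by date.
series = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
examples = make_windows(series, window_size=3, horizon=1)
train, test = split_by_position(examples, cut=3)
```

Each training example is a short run of past values plus the value to predict, so any ordinary learner (an SVM in the video) can be trained on the windows.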
Now to answer your question:
My suggestion would be to rescale absolute error to fall into range 0 to 1, and use this as a measure of probability.
This is the best answer I can give right now.
You need to provide better information to get a better answer.
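One possible reading of the rescaling suggestion, sketched in Python: min-max scale the absolute errors to the range 0 to 1. Flipping the scale so that a small error gives a score near 1 is an assumption added here, as is the function name `error_to_score`. The result is a rough confidence score, not a calibrated probability.

```python
from typing import List


def error_to_score(abs_errors: List[float]) -> List[float]:
    """Min-max rescale absolute errors to [0, 1] and flip them, so the
    smallest error gets score 1.0 and the largest gets score 0.0."""
    lo, hi = min(abs_errors), max(abs_errors)
    if hi == lo:
        # All errors are identical, so there is nothing to rank.
        # Returning 1.0 for each is an arbitrary choice.
        return [1.0 for _ in abs_errors]
    return [1.0 - (e - lo) / (hi - lo) for e in abs_errors]


# Hypothetical labels and predictions, already on the same scale.
labels = [0.50, 0.60, 0.40, 0.70]
preds = [0.50, 0.50, 0.80, 0.60]
abs_errors = [abs(p - y) for p, y in zip(preds, labels)]
scores = error_to_score(abs_errors)
```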
Best regards,
Wessel
-
Thank you wessel for your answer!
You are right, the question was a bit too imprecise. However, you got it right: that's the way I'm doing it.
Unfortunately, I don't know exactly what to do with your suggestion to "rescale absolute error to fall into range 0 to 1, and use this as a measure of probability".
Where do I get the absolute error from?
Thank you in advance!
-
Using this process you can define any performance measure you want.
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.3.008">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="5.3.008" expanded="true" name="Process">
<process expanded="true">
<operator activated="true" class="generate_data" compatibility="5.3.008" expanded="true" height="60" name="Gen TS" width="90" x="45" y="30">
<parameter key="target_function" value="driller oscillation timeseries"/>
</operator>
<operator activated="true" class="generate_attributes" compatibility="5.3.008" expanded="true" height="76" name="Create Sum" width="90" x="180" y="30">
<list key="function_descriptions">
<parameter key="sum" value="str(11*att1+22*att2+33*att3+44*att4+att5)"/>
</list>
</operator>
<operator activated="true" class="guess_types" compatibility="5.3.008" expanded="true" height="76" name="Guess Types" width="90" x="315" y="30"/>
<operator activated="true" class="select_attributes" compatibility="5.3.008" expanded="true" height="76" name="Select Sum" width="90" x="450" y="30">
<parameter key="attribute_filter_type" value="single"/>
<parameter key="attribute" value="sum"/>
</operator>
<operator activated="true" class="normalize" compatibility="5.3.008" expanded="true" height="94" name="Normalize" width="90" x="585" y="30">
<parameter key="method" value="range transformation"/>
</operator>
<operator activated="true" class="series:windowing" compatibility="5.3.000" expanded="true" height="76" name="Win 3 2" width="90" x="720" y="30">
<parameter key="window_size" value="3"/>
<parameter key="create_label" value="true"/>
<parameter key="label_attribute" value="sum"/>
<parameter key="horizon" value="2"/>
</operator>
<operator activated="true" class="series:predict_series" compatibility="5.3.000" expanded="true" height="60" name="Predict: 22 5 22" width="90" x="45" y="120">
<parameter key="window_width" value="15"/>
<parameter key="horizon" value="2"/>
<parameter key="max_training_set_size" value="15"/>
<process expanded="true">
<operator activated="true" class="relevance_vector_machine" compatibility="5.3.008" expanded="true" height="76" name="Relevance Vector Machine" width="90" x="45" y="30"/>
<connect from_port="window example set" to_op="Relevance Vector Machine" to_port="training set"/>
<connect from_op="Relevance Vector Machine" from_port="model" to_port="prediction model"/>
<portSpacing port="source_window example set" spacing="0"/>
<portSpacing port="sink_prediction model" spacing="0"/>
</process>
</operator>
<operator activated="true" class="rename" compatibility="5.3.008" expanded="true" height="76" name="Rename" width="90" x="180" y="120">
<parameter key="old_name" value="prediction(label)"/>
<parameter key="new_name" value="pred"/>
<list key="rename_additional_attributes"/>
</operator>
<operator activated="true" class="generate_attributes" compatibility="5.3.008" expanded="true" height="76" name="Generate Attributes" width="90" x="315" y="120">
<list key="function_descriptions">
<parameter key="pred_times_label" value="pred*label"/>
<parameter key="pred_times_label_greater_0" value="if(pred*label>=0, 1, 0)"/>
<parameter key="abs_pred_minus_label" value="abs(pred-label)"/>
</list>
</operator>
<operator activated="true" class="extract_performance" compatibility="5.3.008" expanded="true" height="76" name="Performance" width="90" x="469" y="119">
<parameter key="performance_type" value="statistics"/>
<parameter key="attribute_name" value="abs_pred_minus_label"/>
</operator>
<connect from_op="Gen TS" from_port="output" to_op="Create Sum" to_port="example set input"/>
<connect from_op="Create Sum" from_port="example set output" to_op="Guess Types" to_port="example set input"/>
<connect from_op="Guess Types" from_port="example set output" to_op="Select Sum" to_port="example set input"/>
<connect from_op="Select Sum" from_port="example set output" to_op="Normalize" to_port="example set input"/>
<connect from_op="Normalize" from_port="example set output" to_op="Win 3 2" to_port="example set input"/>
<connect from_op="Win 3 2" from_port="example set output" to_op="Predict: 22 5 22" to_port="example set"/>
<connect from_op="Predict: 22 5 22" from_port="example set" to_op="Rename" to_port="example set input"/>
<connect from_op="Rename" from_port="example set output" to_op="Generate Attributes" to_port="example set input"/>
<connect from_op="Generate Attributes" from_port="example set output" to_op="Performance" to_port="example set"/>
<connect from_op="Performance" from_port="performance" to_port="result 1"/>
<connect from_op="Performance" from_port="example set" to_port="result 2"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
<portSpacing port="sink_result 3" spacing="0"/>
</process>
</operator>
</process>
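For readers who do not want to load the process, the three attributes created by the "Generate Attributes" operator and the average taken afterwards can be sketched in Python. The rows below are hypothetical values, not output of the process. Note that the sign test `pred*label>=0` is always 1 when both values are non-negative, so it is only informative for values that can be negative, such as differences.

```python
from statistics import mean

# Hypothetical values; in the process, "label" and "pred" come from the
# series operators.
rows = [
    {"label": 0.2, "pred": 0.1},
    {"label": -0.3, "pred": 0.2},
    {"label": 0.5, "pred": 0.4},
    {"label": -0.1, "pred": -0.4},
]

for row in rows:
    product = row["pred"] * row["label"]
    row["pred_times_label"] = product
    # 1 if prediction and label have the same sign (or either is zero).
    row["pred_times_label_greater_0"] = 1 if product >= 0 else 0
    row["abs_pred_minus_label"] = abs(row["pred"] - row["label"])

# Averages over all rows of the derived attributes.
mean_abs_error = mean(r["abs_pred_minus_label"] for r in rows)
sign_agreement = mean(r["pred_times_label_greater_0"] for r in rows)
```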
-
You should get a result looking like this:
(I have problems uploading images and will add the image later. For now, go into the results data set and plot "pred" and "label", and maybe "abs_pred_minus_label".)
Try to figure out why the absolute error is different from average(abs_pred_minus_label).
Also note that I'm not using a fixed split. Instead, I'm using a sliding window validation, because this is the proper way to validate time series models.
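The idea behind sliding window validation can be sketched in Python as follows. This is a simplified illustration, not the RapidMiner operator: the function `sliding_window_errors`, the stand-in learner `last_value`, and the way the horizon offset is counted are all assumptions made for the example.

```python
from statistics import mean
from typing import Callable, List, Sequence


def sliding_window_errors(
    series: Sequence[float],
    train_width: int,
    horizon: int,
    fit_predict: Callable[[Sequence[float]], float],
) -> List[float]:
    """Slide a training window over the series; at each position, fit on
    the window and score one prediction `horizon` steps past its end."""
    errors = []
    for start in range(len(series) - train_width - horizon + 1):
        train = series[start:start + train_width]
        target = series[start + train_width - 1 + horizon]
        errors.append(abs(fit_predict(train) - target))
    return errors


def last_value(train: Sequence[float]) -> float:
    """Stand-in learner (an assumption): predict that the last value repeats."""
    return train[-1]


series = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
errors = sliding_window_errors(series, train_width=4, horizon=2, fit_predict=last_value)
average_error = mean(errors)
```

Unlike a single split at date X, every test point here is predicted by a model that has only seen the values immediately before it.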
This XML shows how you can use the Regression Performance Operator.
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.3.008">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="5.3.008" expanded="true" name="Process">
<process expanded="true">
<operator activated="true" class="subprocess" compatibility="5.3.008" expanded="true" height="76" name="Generate Data (6)" width="90" x="45" y="30">
<process expanded="true">
<operator activated="true" class="generate_data" compatibility="5.3.008" expanded="true" height="60" name="Generate Data" width="90" x="45" y="30">
<parameter key="target_function" value="driller oscillation timeseries"/>
<parameter key="number_examples" value="200"/>
</operator>
<operator activated="true" class="generate_attributes" compatibility="5.3.008" expanded="true" height="76" name="Generate Sum" width="90" x="180" y="30">
<list key="function_descriptions">
<parameter key="sum" value="str(11*att1+22*att2+33*att3+44*att4+att5)"/>
</list>
</operator>
<operator activated="true" class="select_attributes" compatibility="5.3.008" expanded="true" height="76" name="Select Sum" width="90" x="319" y="29">
<parameter key="attribute_filter_type" value="single"/>
<parameter key="attribute" value="sum"/>
</operator>
<operator activated="true" class="parse_numbers" compatibility="5.3.008" expanded="true" height="76" name="Parse Numbers (2)" width="90" x="441" y="26">
<parameter key="attribute_filter_type" value="single"/>
<parameter key="attribute" value="sum"/>
</operator>
<operator activated="true" class="normalize" compatibility="5.3.008" expanded="true" height="94" name="Normalize" width="90" x="561" y="27">
<parameter key="method" value="range transformation"/>
</operator>
<operator activated="true" class="rename" compatibility="5.3.008" expanded="true" height="76" name="Rename Label" width="90" x="699" y="28">
<parameter key="old_name" value="sum"/>
<parameter key="new_name" value="label"/>
<list key="rename_additional_attributes"/>
</operator>
<connect from_op="Generate Data" from_port="output" to_op="Generate Sum" to_port="example set input"/>
<connect from_op="Generate Sum" from_port="example set output" to_op="Select Sum" to_port="example set input"/>
<connect from_op="Select Sum" from_port="example set output" to_op="Parse Numbers (2)" to_port="example set input"/>
<connect from_op="Parse Numbers (2)" from_port="example set output" to_op="Normalize" to_port="example set input"/>
<connect from_op="Normalize" from_port="example set output" to_op="Rename Label" to_port="example set input"/>
<connect from_op="Rename Label" from_port="example set output" to_port="out 1"/>
<portSpacing port="source_in 1" spacing="0"/>
<portSpacing port="sink_out 1" spacing="0"/>
<portSpacing port="sink_out 2" spacing="0"/>
</process>
</operator>
<operator activated="true" class="series:windowing" compatibility="5.3.000" expanded="true" height="76" name="Win 3 2" width="90" x="187" y="32">
<parameter key="window_size" value="3"/>
<parameter key="create_label" value="true"/>
<parameter key="label_attribute" value="label"/>
<parameter key="horizon" value="2"/>
</operator>
<operator activated="true" class="multiply" compatibility="5.3.008" expanded="true" height="94" name="Multiply" width="90" x="309" y="34"/>
<operator activated="true" class="series:sliding_window_validation" compatibility="5.3.000" expanded="true" height="112" name="Validation" width="90" x="515" y="30">
<parameter key="training_window_width" value="15"/>
<parameter key="test_window_width" value="1"/>
<parameter key="horizon" value="2"/>
<parameter key="average_performances_only" value="false"/>
<process expanded="true">
<operator activated="true" class="relevance_vector_machine" compatibility="5.3.008" expanded="true" height="76" name="Relevance VM (2)" width="90" x="152" y="50"/>
<connect from_port="training" to_op="Relevance VM (2)" to_port="training set"/>
<connect from_op="Relevance VM (2)" from_port="model" to_port="model"/>
<portSpacing port="source_training" spacing="0"/>
<portSpacing port="sink_model" spacing="0"/>
<portSpacing port="sink_through 1" spacing="0"/>
</process>
<process expanded="true">
<operator activated="true" class="apply_model" compatibility="5.3.008" expanded="true" height="76" name="Apply Model" width="90" x="91" y="12">
<list key="application_parameters"/>
</operator>
<operator activated="true" class="performance_regression" compatibility="5.3.008" expanded="true" height="76" name="Performance" width="90" x="282" y="61">
<parameter key="root_mean_squared_error" value="false"/>
<parameter key="absolute_error" value="true"/>
</operator>
<connect from_port="model" to_op="Apply Model" to_port="model"/>
<connect from_port="test set" to_op="Apply Model" to_port="unlabelled data"/>
<connect from_op="Apply Model" from_port="labelled data" to_op="Performance" to_port="labelled data"/>
<connect from_op="Performance" from_port="performance" to_port="averagable 1"/>
<portSpacing port="source_model" spacing="0"/>
<portSpacing port="source_test set" spacing="0"/>
<portSpacing port="source_through 1" spacing="0"/>
<portSpacing port="sink_averagable 1" spacing="0"/>
<portSpacing port="sink_averagable 2" spacing="0"/>
</process>
</operator>
<operator activated="true" class="series:predict_series" compatibility="5.3.000" expanded="true" height="60" name="Predict: 22 5 22" width="90" x="78" y="331">
<parameter key="window_width" value="15"/>
<parameter key="horizon" value="2"/>
<parameter key="max_training_set_size" value="15"/>
<process expanded="true">
<operator activated="true" class="relevance_vector_machine" compatibility="5.3.008" expanded="true" height="76" name="Relevance VM" width="90" x="412" y="29"/>
<connect from_port="window example set" to_op="Relevance VM" to_port="training set"/>
<connect from_op="Relevance VM" from_port="model" to_port="prediction model"/>
<portSpacing port="source_window example set" spacing="0"/>
<portSpacing port="sink_prediction model" spacing="0"/>
</process>
</operator>
<operator activated="true" class="rename" compatibility="5.3.008" expanded="true" height="76" name="Rename" width="90" x="263" y="330">
<parameter key="old_name" value="prediction(label)"/>
<parameter key="new_name" value="pred"/>
<list key="rename_additional_attributes"/>
</operator>
<operator activated="true" class="generate_attributes" compatibility="5.3.008" expanded="true" height="76" name="Generate Attributes" width="90" x="439" y="335">
<list key="function_descriptions">
<parameter key="abs_pred_minus_label" value="abs(pred-label)"/>
</list>
</operator>
<operator activated="true" class="extract_performance" compatibility="5.3.008" expanded="true" height="76" name="Performance (2)" width="90" x="657" y="349">
<parameter key="performance_type" value="statistics"/>
<parameter key="attribute_name" value="abs_pred_minus_label"/>
</operator>
<connect from_op="Generate Data (6)" from_port="out 1" to_op="Win 3 2" to_port="example set input"/>
<connect from_op="Win 3 2" from_port="example set output" to_op="Multiply" to_port="input"/>
<connect from_op="Multiply" from_port="output 1" to_op="Validation" to_port="training"/>
<connect from_op="Multiply" from_port="output 2" to_op="Predict: 22 5 22" to_port="example set"/>
<connect from_op="Validation" from_port="averagable 1" to_port="result 1"/>
<connect from_op="Predict: 22 5 22" from_port="example set" to_op="Rename" to_port="example set input"/>
<connect from_op="Rename" from_port="example set output" to_op="Generate Attributes" to_port="example set input"/>
<connect from_op="Generate Attributes" from_port="example set output" to_op="Performance (2)" to_port="example set"/>
<connect from_op="Performance (2)" from_port="performance" to_port="result 2"/>
<connect from_op="Performance (2)" from_port="example set" to_port="result 3"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
<portSpacing port="sink_result 3" spacing="0"/>
<portSpacing port="sink_result 4" spacing="0"/>
</process>
</operator>
</process>
-
Dear Wessel!
Thank you so much for your answer. Since I'm a beginner, I don't know how to import your XML as a new operator into my process from videos 8 to 10, and I'm not sure where in the chain to place this operator.
Best regards, Dai Wizard!
-
Click view.
Create new perspective.
In show view, tick XML, untick all others.
In XML tab:
Paste XML code
Click green V symbol.
Return to your standard view.
-
Hi!
Thank you wessel for your tips, but I'm afraid it looks too complicated for me; I don't think I can fully understand it. Therefore I've created a PDF file that you can view using this link: http://www.professor-heusenstamm.com/model.pdf
Bild 1 shows my original process; Bild 2 is the content of the validation operator.
Bild 3 shows the general performance output.
Bild 4 is my latest progress :-) I've inserted the Log operator and defined the values for performance and prediction accuracy there.
Bild 5 shows the result of the latter.
My question is: did I insert the Log operator at the correct position in the process (Bild 4), so that it delivers the performance of the predicted n+1 value, which is the content of "Read Excel (2)", or do I have to rearrange or add something?
As usual, I'm looking forward to anybody's comments.
-
My process (I call this a process, not a model) looks like this:
http://i.snag.gy/STABy.jpg
I used this button to create a new perspective (I named this perspective XML):
http://i.snag.gy/A53kc.jpg
So now my screen looks like:
http://i.snag.gy/6QXgV.jpg
This is easy for sharing processes.