"SVM prediction module guidance"

rm_stallion New Altair Community Member
edited November 5 in Altair RapidMiner
Hi,

I am working on an academic project on stock prediction. While trying to figure out how SVMs work, I came across RapidMiner. I have been using it for the last 2 hours and cannot figure out how to predict values for future dates (horizon > 1). I increased the horizon, but it then shows one future value for every value in the input data (if the horizon is 5, it shows one value per input row, which is supposed to be the predicted value for the 5th day after that input). Is there any way to display the future values in proper sequence, e.g. day 1 - predicted value 1, day 2 - predicted value 2, etc.?
Also, is there any way to improve the prediction accuracy?
Also, can I somehow incorporate this particular prediction module into the Java code for my GUI, or should I call RapidMiner explicitly from my Java program? (I only want to use the SVM prediction module, not all the features of RapidMiner.)
It would be great if you could help me out.
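
To make the horizon behaviour described above concrete, here is a minimal plain-Python sketch of what a windowing step with a horizon appears to do. It is not RapidMiner code: the function name `window_with_horizon` is hypothetical, and it assumes that the horizon is the distance between the last value in the window and the value used as the label. The prices are the first eight values of the price column in train.csv.

```python
def window_with_horizon(series, window_size, horizon):
    """Pair each window of past values with the value `horizon` steps
    after the window's last value (assumed meaning of 'horizon')."""
    rows = []
    # Last start index for which the label still lies inside the series.
    last_start = len(series) - window_size - horizon
    for start in range(last_start + 1):
        window = series[start:start + window_size]
        label = series[start + window_size - 1 + horizon]
        rows.append((window, label))
    return rows


# First eight values of the price column in train.csv.
prices = [564.08, 562.48, 562.76, 562.3, 562.17, 562.08, 561.658, 561.52]

# Same settings as the Windowing operator in the process: window 1, horizon 5.
for window, label in window_with_horizon(prices, window_size=1, horizon=5):
    print(window, "->", label)
```

Under that assumption every row gets exactly one label, the value 5 steps ahead, which would explain why a single horizon yields one prediction per input row rather than a day 1 to day 5 sequence.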

Here is the XML of my test process:


<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.0">
 <context>
   <input>
     <location/>
   </input>
   <output>
     <location/>
     <location/>
     <location/>
     <location/>
   </output>
   <macros/>
 </context>
 <operator activated="true" class="process" expanded="true" name="Process">
   <process expanded="true" height="423" width="763">
     <operator activated="true" class="read_csv" expanded="true" height="60" name="Read CSV" width="90" x="45" y="30">
       <parameter key="file_name" value="C:\Users\Rj\Downloads\train.csv"/>
     </operator>
     <operator activated="true" class="set_role" expanded="true" height="76" name="Set Role" width="90" x="179" y="30">
       <parameter key="name" value="1"/>
       <parameter key="target_role" value="id"/>
     </operator>
     <operator activated="true" class="series:windowing" expanded="true" height="76" name="Windowing" width="90" x="313" y="30">
       <parameter key="horizon" value="5"/>
       <parameter key="window_size" value="1"/>
       <parameter key="create_label" value="true"/>
       <parameter key="label_attribute" value="564.08"/>
     </operator>
     <operator activated="true" class="series:sliding_window_validation" expanded="true" height="112" name="Validation" width="90" x="447" y="30">
       <parameter key="training_window_width" value="5"/>
       <parameter key="training_window_step_size" value="1"/>
       <parameter key="test_window_width" value="5"/>
       <process expanded="true">
         <operator activated="true" class="nominal_to_numerical" expanded="true" height="94" name="Nominal to Numerical" width="90" x="45" y="255"/>
         <operator activated="true" class="support_vector_machine" expanded="true" height="112" name="SVM" width="90" x="112" y="75">
           <parameter key="kernel_degree" value="5.0"/>
           <parameter key="C" value="1.0"/>
         </operator>
         <connect from_port="training" to_op="Nominal to Numerical" to_port="example set input"/>
         <connect from_op="Nominal to Numerical" from_port="example set output" to_op="SVM" to_port="training set"/>
         <connect from_op="SVM" from_port="model" to_port="model"/>
         <portSpacing port="source_training" spacing="0"/>
         <portSpacing port="sink_model" spacing="0"/>
         <portSpacing port="sink_through 1" spacing="0"/>
       </process>
       <process expanded="true">
         <operator activated="true" class="apply_model" expanded="true" height="76" name="Apply Model" width="90" x="66" y="30">
           <list key="application_parameters"/>
         </operator>
         <operator activated="true" class="series:forecasting_performance" expanded="true" height="76" name="Performance" width="90" x="195" y="25">
           <parameter key="horizon" value="1"/>
         </operator>
         <connect from_port="model" to_op="Apply Model" to_port="model"/>
         <connect from_port="test set" to_op="Apply Model" to_port="unlabelled data"/>
         <connect from_op="Apply Model" from_port="labelled data" to_op="Performance" to_port="labelled data"/>
         <connect from_op="Performance" from_port="performance" to_port="averagable 1"/>
         <portSpacing port="source_model" spacing="0"/>
         <portSpacing port="source_test set" spacing="0"/>
         <portSpacing port="source_through 1" spacing="0"/>
         <portSpacing port="sink_averagable 1" spacing="0"/>
         <portSpacing port="sink_averagable 2" spacing="0"/>
       </process>
     </operator>
     <operator activated="true" class="read_csv" expanded="true" height="60" name="Read CSV (2)" width="90" x="45" y="255">
       <parameter key="file_name" value="C:\Users\Rj\Downloads\test.csv"/>
     </operator>
     <operator activated="true" class="set_role" expanded="true" height="76" name="Set Role (2)" width="90" x="179" y="255">
       <parameter key="name" value="1"/>
       <parameter key="target_role" value="id"/>
     </operator>
     <operator activated="true" class="series:windowing" expanded="true" height="76" name="Windowing (2)" width="90" x="313" y="255">
       <parameter key="window_size" value="1"/>
       <parameter key="label_attribute" value="562.21"/>
     </operator>
     <operator activated="true" class="apply_model" expanded="true" height="76" name="Apply Model (2)" width="90" x="492" y="261">
       <list key="application_parameters"/>
     </operator>
     <connect from_op="Read CSV" from_port="output" to_op="Set Role" to_port="example set input"/>
     <connect from_op="Set Role" from_port="example set output" to_op="Windowing" to_port="example set input"/>
     <connect from_op="Windowing" from_port="example set output" to_op="Validation" to_port="training"/>
     <connect from_op="Validation" from_port="model" to_op="Apply Model (2)" to_port="model"/>
     <connect from_op="Validation" from_port="training" to_port="result 1"/>
     <connect from_op="Validation" from_port="averagable 1" to_port="result 2"/>
     <connect from_op="Read CSV (2)" from_port="output" to_op="Set Role (2)" to_port="example set input"/>
     <connect from_op="Set Role (2)" from_port="example set output" to_op="Windowing (2)" to_port="example set input"/>
     <connect from_op="Windowing (2)" from_port="example set output" to_op="Apply Model (2)" to_port="unlabelled data"/>
     <connect from_op="Apply Model (2)" from_port="labelled data" to_port="result 3"/>
     <portSpacing port="source_input 1" spacing="0"/>
     <portSpacing port="sink_result 1" spacing="0"/>
     <portSpacing port="sink_result 2" spacing="0"/>
     <portSpacing port="sink_result 3" spacing="0"/>
     <portSpacing port="sink_result 4" spacing="0"/>
   </process>
 </operator>
</process>


The CSV files I used contained the following data:
train.csv
1 GOOG 564.08 564.78 561.01 565.18
2 GOOG 562.48 564.78 561.01 565.18
3 GOOG 562.76 559.46 558.71 564.66
4 GOOG 562.3 559.46 558.71 564.66
5 GOOG 562.17 559.46 558.71 564.66
6 GOOG 562.08 559.46 558.71 564.66
7 GOOG 561.658 559.46 558.71 564.66
8 GOOG 561.52 559.46 558.71 564.66
9 GOOG 560.548 559.46 558.71 564.66
10 GOOG 560.19 559.46 556.5 564.66
11 GOOG 562.77 563.75 562.4 564.22
12 GOOG 564.95 563.75 562.21 565.85
13 GOOG 566.87 563.75 562.21 568
14 GOOG 571.01 563.75 562.21 571.22
15 GOOG 571.89 563.75 562.21 571.909
16 GOOG 570.8115 563.75 562.21 572
17 GOOG 567.34 563.75 562.21 572
18 GOOG 569.2 563.75 562.21 572
19 GOOG 570.73 563.75 562.21 572
20 GOOG 570.13 563.75 562.21 572
21 GOOG 572.16 563.75 562.21 572.2

test.csv
1 GOOG 575.22 563.75 562.21 575.25
2 GOOG 575.16 563.75 562.21 578.5

I want to predict future values (for the next 10 days) using the input from test.csv. Is there any way to predict all 10 values (with as high accuracy as possible) and display them too?
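
The day-by-day output asked for above could be laid out roughly as in the sketch below, which fits one model per horizon (1 to 10) and prints the results in order. This is only an illustration of the structure, in plain Python rather than RapidMiner. The names `fit_mean_change` and `forecast_sequence` are hypothetical, and the "model" is a deliberately trivial stand-in (the average change over h steps in the training series), not an SVM; a real learner would take its place.

```python
def fit_mean_change(series, horizon):
    """Stand-in 'model' (NOT an SVM): the average change observed over
    `horizon` steps in the training series."""
    changes = [series[i + horizon] - series[i]
               for i in range(len(series) - horizon)]
    return sum(changes) / len(changes)


def forecast_sequence(train, last_value, max_horizon):
    """Fit one stand-in model per horizon and return (day, forecast) pairs."""
    forecasts = []
    for h in range(1, max_horizon + 1):
        mean_change = fit_mean_change(train, h)
        forecasts.append((h, last_value + mean_change))
    return forecasts


# Price column of train.csv.
train = [564.08, 562.48, 562.76, 562.3, 562.17, 562.08, 561.658, 561.52,
         560.548, 560.19, 562.77, 564.95, 566.87, 571.01, 571.89, 570.8115,
         567.34, 569.2, 570.73, 570.13, 572.16]

# Most recent price in test.csv, used as the starting point.
last_test_value = 575.16

for day, value in forecast_sequence(train, last_test_value, max_horizon=10):
    print(f"day {day} - predicted value {value:.2f}")
```

The point of the sketch is the loop over horizons: each horizon gets its own label definition and its own fitted model, and the results are collected in day order.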