"Time Series questions about window size, step size and horizon"
I have been learning to solve time series problems in RapidMiner. I have already read several of @Thomas_Ott's replies and the Financial Market Model class on the blog website, and I have the basic template. I am still confused about how to choose the right values for window size, step size and horizon.
Here are my questions. I would really appreciate some tips!
1. Is it better to just set the window size equal to the length of the cycle? For example, 7 for daily data with a weekly cycle, or 12 for monthly data with a yearly cycle.
2. How should I set the training and test window widths in Sliding Validation? Should they also equal the length of the period?
3. How do I use RMSE to evaluate performance in RapidMiner?
4. If my original example set contains both nominal and numeric data, should I do any transformation? Which model would work better in that case: SVM? ANN? Polynominal?
Many thanks!
Best Answer
Hey @rpleaner, these are all great questions. Not to sound ambivalent, but it really depends on your domain expertise. I would always use Optimize Parameters to tune window size, training/testing size, etc., once you have a first setup built on your initial assumptions.
To answer your questions:
1. Yes, it's a good start. If you're analyzing a traded financial series like a stock, look at 5-day windows (one trading week), or if the data is hourly, maybe an 8-hour window. Remember, the window creates a 'section' of data. So if you were interested in knowing whether the trend would end up or down, your 5-day window would have all the attributes from Fri, Mon, Tues, Wed, Thurs.
2. Probably a good start as well, but I would first look at things like weekly or monthly time periods for training and then one week or one month for testing. These values should be tuned during optimization. You'll be surprised at the kind of performance increase you can get by doing this.
3. You could use the Regression Performance operator instead of the Forecast Performance one. That will get you RMSE. When you Optimize Parameters, make sure to select RMSE as the value you want to optimize around.
4. You're going to have to do some conversions for sure if you want to use an SVM and have polynominal attributes. Each machine learning algorithm is different in which attribute types it accepts.
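To make points 1 to 3 concrete outside of RapidMiner, here is a minimal plain-Python sketch of windowing, a walk-forward (sliding) validation, and RMSE. It is an illustration under stated assumptions, not what the RapidMiner operators do internally: the function names (`make_windows`, `sliding_validation`, `rmse`), the toy series, and the "predict the training mean" stand-in model are all made up for this example, and the horizon is assumed to mean "how many steps after the end of the window the label sits".

```python
import math


def make_windows(series, window_size, horizon=1, step_size=1):
    """Turn a 1-D series into (window, label) pairs.

    Each window holds `window_size` consecutive values. The label is the
    value `horizon` steps after the last value in the window (assumed
    meaning of "horizon" for this sketch).
    """
    pairs = []
    last_start = len(series) - window_size - horizon
    for start in range(0, last_start + 1, step_size):
        window = series[start:start + window_size]
        label = series[start + window_size + horizon - 1]
        pairs.append((window, label))
    return pairs


def rmse(actual, predicted):
    """Root mean squared error between two equally long sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def sliding_validation(pairs, train_width, test_width, step):
    """Walk forward in time: fit on one block, score on the block after it."""
    scores = []
    start = 0
    while start + train_width + test_width <= len(pairs):
        train = pairs[start:start + train_width]
        test = pairs[start + train_width:start + train_width + test_width]
        # Stand-in "model" (hypothetical): always predict the mean training label.
        mean_label = sum(label for _, label in train) / len(train)
        actual = [label for _, label in test]
        predicted = [mean_label] * len(test)
        scores.append(rmse(actual, predicted))
        start += step
    return scores


# Toy data: 20 observations, e.g. four 5-day trading weeks.
series = list(range(1, 21))
pairs = make_windows(series, window_size=5, horizon=1, step_size=1)
scores = sliding_validation(pairs, train_width=7, test_width=3, step=3)
print(len(pairs), scores)
```

Tuning then amounts to repeating this for several candidate values of `window_size`, `train_width` and `test_width` and keeping the combination with the lowest average RMSE, which is roughly the job Optimize Parameters does for you.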
Have you checked out this blog post of mine? http://www.neuralmarkettrends.com/predicting-historical-volatility-for-the-sp500/
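On question 4, the kind of conversion meant is turning each nominal attribute into numeric columns before handing the data to a learner that only accepts numbers. A rough plain-Python sketch of dummy (one-hot) coding follows. The helper name `one_hot_encode`, the column names and the sample rows are invented for illustration. In RapidMiner itself, I believe the Nominal to Numerical operator covers this, but check its settings rather than relying on this sketch.

```python
def one_hot_encode(rows, nominal_columns):
    """Replace each nominal column with one 0/1 column per category."""
    # Collect the categories seen in each nominal column, in sorted order
    # so the generated column order is stable.
    categories = {
        col: sorted({row[col] for row in rows}) for col in nominal_columns
    }
    encoded = []
    for row in rows:
        # Numeric columns are copied through unchanged.
        new_row = {k: v for k, v in row.items() if k not in nominal_columns}
        for col in nominal_columns:
            for cat in categories[col]:
                new_row[f"{col}={cat}"] = 1.0 if row[col] == cat else 0.0
        encoded.append(new_row)
    return encoded


# Hypothetical mixed example set: one numeric and one nominal attribute.
rows = [
    {"close": 101.5, "weekday": "Mon"},
    {"close": 102.0, "weekday": "Tue"},
    {"close": 100.8, "weekday": "Mon"},
]
encoded = one_hot_encode(rows, ["weekday"])
print(encoded[0])
```

After this step every attribute is numeric, so the same data can be fed to an SVM or a neural net.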