Dear all,
I see two possible approaches to predicting time series. As I am not very experienced yet, I would like to discuss their advantages and disadvantages with you. My general assumption for prediction tasks is that the attributes need to be correlated with the label in some way. In time series in particular, however, the maximal correlation may occur at a lag, i.e. between the label and an earlier value of the attribute.
Example data set:
label att1 att2
5 1 7
6 2 8
7 3 9
8 4 10
9 5 11
Question 1)
I would assume that in the given example att2 contributes more to predicting the label than att1, because its course is ahead of the label (att2 reaches each value two steps before the label does). Would you agree, even though att1 shows the same development over time?
Approach 1)
I could compare the label with each attribute individually at each lag (i.e. correlation of label vs. att1-0, label vs. att1-1, label vs. att1-2, label vs. att2-0, ...). I would then keep only the attribute/lag combinations with the best correlation and feed them into a learner for prediction.
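To make Approach 1 concrete, here is a minimal sketch in plain Python (not a RapidMiner process). The function names `pearson`, `lagged_correlation` and `best_lag` and the small data series are hypothetical and only serve as illustration; a lag of k is assumed to mean that label[t] is compared with attribute[t - k].

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def lagged_correlation(label, attribute, lag):
    """Correlate label[t] with attribute[t - lag] for lag >= 0."""
    if lag == 0:
        return pearson(label, attribute)
    # Drop the first `lag` labels and the last `lag` attribute values
    # so that both sequences line up and have the same length.
    return pearson(label[lag:], attribute[:-lag])


def best_lag(label, attribute, max_lag):
    """Return the lag with the largest absolute correlation and all scores."""
    scores = {
        lag: lagged_correlation(label, attribute, lag)
        for lag in range(max_lag + 1)
    }
    return max(scores, key=lambda lag: abs(scores[lag])), scores


# Hypothetical data: the label repeats the attribute two steps later.
attribute = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8]
label = [0, 0] + attribute[:-2]

lag, scores = best_lag(label, attribute, max_lag=3)
print(lag, scores)
```

One caveat: on the purely linear example table above, label, att1 and att2 all rise by one per step, so every lag gives a correlation of 1 (up to rounding) and this procedure cannot tell the lags apart. Series with a strong common trend may therefore need detrending or differencing before lagged correlations become informative.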
Approach 2)
I could use a windowing operator to create several lags of every attribute in one data set (i.e. label, att1-0, att1-1, att1-2, att2-0, ... in one set). The result would be a data set with many attributes that I can then feed into the learner, assuming that the learner itself selects the attributes that describe the label best.
(I think that is also the idea in this article:
http://rapid-i.com/rapidforum/index.php/topic,200.0.html However, one has to be careful that the model does not end up using only lagged variants of the same attribute (e.g. att1-0, att1-1, att1-2), as in that case it would be built more or less on copies of the same data.)
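The windowing step of Approach 2 can be sketched in plain Python as follows. This is an illustration only and not the RapidMiner windowing operator itself; the function name `window_lags` and the dictionary layout are assumptions, while the column names follow the att1-0, att1-1, ... scheme used above and the rows are the example data set from this post.

```python
def window_lags(rows, attributes, max_lag):
    """Build one example per time step with lags 0..max_lag of each attribute.

    rows: list of dicts in time order, each with a 'label' and the attributes.
    The first max_lag rows are dropped because their lagged values are missing.
    """
    windowed = []
    for t in range(max_lag, len(rows)):
        example = {"label": rows[t]["label"]}
        for name in attributes:
            for lag in range(max_lag + 1):
                # "att1-2" holds the value of att1 two steps before time t.
                example[f"{name}-{lag}"] = rows[t - lag][name]
        windowed.append(example)
    return windowed


# The example data set from this post.
rows = [
    {"label": 5, "att1": 1, "att2": 7},
    {"label": 6, "att1": 2, "att2": 8},
    {"label": 7, "att1": 3, "att2": 9},
    {"label": 8, "att1": 4, "att2": 10},
    {"label": 9, "att1": 5, "att2": 11},
]

windowed = window_lags(rows, ["att1", "att2"], max_lag=2)
for example in windowed:
    print(example)
```

With max_lag=2 the five rows shrink to three examples, and in each of them att2-2 equals the label, which is the lagged relationship the learner would have to pick up. The other lagged columns are shifted copies of the same trend, which illustrates the concern about building the model on near-duplicates.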
Please advise which approach you favour and what your experience with time series has been.
Best regards
Sachs