"Support Vector Machines - Data Input"
mburger
New Altair Community Member
Hi,
Right now I am trying to create a simple forecasting model with an SVM that should be able to recognise seasonality or other influences on my, let's say, sales.
I have already studied the Windowing operator, which splits my training data into time windows like this:
variables |label
v1 v2 v3 v4 v5 |v6
v2 v3 v4 v5 v6 |v7
v3 v4 v5 v6 v7 |v8
...
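For readers outside RapidMiner, the windowing shown above can be sketched in plain Python. This is only an illustration, not the Windowing operator itself: the function name `make_windows` and the sales values are made up.

```python
def make_windows(series, width=5):
    """Return (inputs, label) pairs: `width` consecutive values
    and the single value that follows them."""
    rows = []
    for start in range(len(series) - width):
        inputs = series[start:start + width]   # v1..v5, then v2..v6, ...
        label = series[start + width]          # v6, then v7, ...
        rows.append((inputs, label))
    return rows


sales = [1, 5, 12, 0, 0, 3, 7, 9]  # made-up example values
windows = make_windows(sales, width=5)
for inputs, label in windows:
    print(inputs, "|", label)
```

Each printed row corresponds to one line of the table: five past values as variables and the next value as label.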
The first question: should I transform the input? Say I have data like 1,5,12,0,0 ... and these are sales. Should I scale them to between 0 and 1?
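To make clear what "scale between 0 and 1" means here, a minimal min-max scaling sketch in Python follows. The helper names `fit_min_max` and `scale` are hypothetical, and taking the minimum and maximum from the training data only is an assumption of this sketch, not something stated in this thread.

```python
def fit_min_max(values):
    """Remember the range of the training data."""
    return min(values), max(values)


def scale(values, lo, hi):
    """Map values linearly so that lo -> 0.0 and hi -> 1.0."""
    span = hi - lo
    if span == 0:
        # Constant series: avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / span for v in values]


train_sales = [1, 5, 12, 0, 0]      # the example values from the post
lo, hi = fit_min_max(train_sales)
scaled = scale(train_sales, lo, hi)
print(scaled)
```

New data scaled with the same `lo` and `hi` can fall outside 0 and 1 if sales exceed the range seen in training.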
In addition, I want to let my model know if something special happened. So I create a new attribute, say u, which contains the information about the Christmas holidays.
variables |label
v1 v2 v3 v4 v5 u1 u2 u3 u4 u5 |v6
v2 v3 v4 v5 v6 u2 u3 u4 u5 u6 |v7
e.g. v: 1,5,12,0,0 u: 1,1,1,0,0
In this case, u should tell me that the sales on the first 3 days of my time period (1, 5, 12) were influenced by the Christmas holidays.
Is this the right way to do it? If I scale v to between 0 and 1 and have the binary variable u with 0 and 1, can the SVM handle that well?
The next question: the Christmas influence is not just 0 or 1. If I want to model a seasonal development within the
Christmas sales, how would I do that? Use graded values like 0.2, 0.3, 0.4?
Also, I use a window of 5 days to predict the next day, so the window does not cover the whole season. Is there a danger that, if there
were no Christmas sales for a long time and I retrain my model on a regular basis, it "forgets" about Christmas?
Once I have solved all the problems above and have a new model that takes Christmas and everything else into account and could give a good hint for forecasting,
I now must tell the model for which situation it should forecast. Let's say I have:
variables/situation |prediction
v1 v2 v3 v4 v5 u1 u2 u3 u4 u5 |f6
v2 v3 v4 v5 v6 u2 u3 u4 u5 u6 |f7
How do I tell the model that there will be Christmas at f6? With u I only describe the past.
Greetings
Martin
Answers
Hi,
It is no problem if a window does not cover a complete season, as long as your training set does. That way at least some of the input data contains Christmas data, so your model can adapt to it.
About the problem of u only covering the past: you can also add values from the future. That is valid, since it is already known in advance that there will be Christmas in December, so you don't add any invalid information to your training data.
To achieve that, you'll have to window u and v separately and join them together afterwards. If you need help implementing that in RapidMiner, just tell us.
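Outside RapidMiner, the idea of windowing u and v separately and joining them could look roughly like the Python sketch below. It is an illustration under assumptions, not the RapidMiner process: the function name `window_joined` is hypothetical, the data is made up, and the u window is simply extended by one day so that it also carries the flag for the day being predicted.

```python
def window_joined(v, u, width=5):
    """Build rows of past sales plus holiday flags, where the flags
    also cover the target day (known in advance from the calendar)."""
    rows = []
    for start in range(len(v) - width):
        target = start + width
        v_part = v[start:target]        # sales of the last `width` days
        u_part = u[start:target + 1]    # flags for those days AND the target day
        label = v[target]               # sales on the target day
        rows.append((v_part + u_part, label))
    return rows


v = [1, 5, 12, 0, 0, 3, 7]   # made-up sales
u = [1, 1, 1, 0, 0, 0, 0]    # 1 = Christmas holidays, 0 = normal day
rows = window_joined(v, u)
for features, label in rows:
    print(features, "|", label)
```

When forecasting, the last five known sales values are combined in the same way with six flags, the sixth being the calendar flag for the day to be predicted.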
Best regards,
Marius