forecast SVR

sarah_mi88
sarah_mi88 New Altair Community Member
edited November 2024 in Community Q&A
Hi everyone,

I want to apply support vector regression to sales data, using 2016-2017 for training and 2018 for testing (label: date). My aim is to see the forecast values for the next 4 periods. But the "Apply Forecast" operator doesn't work, and the "Performance (Regression)" operator doesn't evaluate labels of type date. For the parameters I chose, see the screenshot below. If any information is missing, please comment. What do I have to do?

Thanks and cheers,
Sarah

Answers

  • varunm1
    varunm1 New Altair Community Member
    Hello @sarah_mi88

    It says that you set a column with the date data type as the label column. Did you set that? Can you provide the .rmp file (File --> Export Process) and the data for us to debug?
  • sarah_mi88
    sarah_mi88 New Altair Community Member
    Hello @varunm1

    Thanks for your help! I specified the label and now get the prediction values. But how do I find "good" values for gamma, C, epsilon/nu and p? (nu-SVR or epsilon-SVR; I want to do regression.) What is common practice? Doing CV? But how? See the .rmp file in the attachment. Currently the prediction doesn't include trend or seasonality; the predicted value is the same for the whole test interval.

  • varunm1
    varunm1 New Altair Community Member
    edited November 2019 Answer ✓
    But how do I find "good" values for gamma, C, epsilon/nu and p? (nu-SVR or epsilon-SVR, I want to do regression). What is common practice? Doing CV? But how?
    We use the "Optimize Parameters (Grid)" operator to search for the best hyperparameters of a model (SVM in this case). CV on its own only estimates performance for a given parameter set; it doesn't choose the parameters.

    In your process, I see that "Datum" (the date, I think) is set as a label, and "Aufzugstechnik" is also set as a label. A prediction model can take only one label attribute. In your case, I guess it should be "Aufzugstechnik".

    Is your dataset time-dependent (a time series)? If so, regular cross-validation is not a good fit, because its folds can train on data that comes later than the test data, so it fails as a time-series backtest. You should use the "Sliding Window Validation" method instead.

    Here is a link that helps explain the time-series process in RapidMiner:

    https://rapidminer.com/resource/time-series-analysis/

    I attached a modified process. Since I don't have your datasets, I made some modifications; you need to add windowing based on your dataset.

    You can also see how to use "Optimize Parameters" for the SVM hyperparameters inside this Sliding Window Validation operator.
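
For readers who want to see the same idea outside RapidMiner, here is a minimal Python sketch using scikit-learn. It is not the attached process. The synthetic series, the window length of 4, the grid values, and the helper name `make_windows` are all assumptions made up for illustration. It shows three steps: windowing the series into lagged inputs, splitting it so that every test fold comes after its training fold, and running a grid search over C, gamma and epsilon inside that split.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR


def make_windows(series, window):
    """Turn a 1-D series into (lagged inputs, next value) pairs."""
    values = np.asarray(series, dtype=float)
    X = np.array([values[i:i + window] for i in range(len(values) - window)])
    y = values[window:]
    return X, y


# Synthetic data, made up for illustration: linear trend plus a period-4 cycle.
t = np.arange(48)
series = 100 + 2.0 * t + 10 * np.sin(2 * np.pi * t / 4)

X, y = make_windows(series, window=4)

# Each fold trains on earlier rows and tests on the rows right after them.
# max_train_size caps the training span, so the window slides instead of growing.
cv = TimeSeriesSplit(n_splits=4, max_train_size=24)

# Scale the inputs before the SVR; the step name "svr" comes from make_pipeline.
model = make_pipeline(StandardScaler(), SVR(kernel="rbf"))
param_grid = {
    "svr__C": [1, 10, 100],
    "svr__gamma": [0.01, 0.1, 1.0],
    "svr__epsilon": [0.1, 1.0],
}

search = GridSearchCV(model, param_grid, cv=cv, scoring="neg_mean_absolute_error")
search.fit(X, y)

print(search.best_params_)
```

The grid here is deliberately small; in practice the value ranges would need to be adapted to the scale of the real sales data.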

  • sarah_mi88
    sarah_mi88 New Altair Community Member
    Thank you so much, I really appreciate it. I also get this error. Can you help me with that too? (attached xlsx)
  • varunm1
    varunm1 New Altair Community Member
    Hello @sarah_mi88

    This error occurs when the date column of your dataset contains irregular entries. For time series, the date column must be strictly increasing (the same date and time must not appear more than once in your dataset).

    Based on the dataset you gave (a very small one), please find the working process.
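
A quick way to check a date column for the two problems described above (repeated dates and dates out of sequence) is sketched below in plain Python. The function name `check_time_index` and the sample dates are made up for illustration; they are not taken from the attached xlsx.

```python
from collections import Counter
from datetime import date


def check_time_index(dates):
    """Return (duplicates, out_of_order) for a sequence of dates."""
    counts = Counter(dates)
    # Dates that appear more than once.
    duplicates = sorted(d for d, n in counts.items() if n > 1)
    # Neighbouring pairs where the later row has an earlier date.
    out_of_order = [
        (prev, cur) for prev, cur in zip(dates, dates[1:]) if cur < prev
    ]
    return duplicates, out_of_order


# Made-up quarterly dates with one repeated entry and one entry out of sequence.
dates = [
    date(2016, 3, 31),
    date(2016, 6, 30),
    date(2016, 6, 30),
    date(2016, 12, 31),
    date(2016, 9, 30),
]

duplicates, out_of_order = check_time_index(dates)
print("duplicates:", duplicates)
print("out of order:", out_of_order)
```

If both lists come back empty, the column is strictly increasing and should be usable as a time index.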
  • sarah_mi88
    sarah_mi88 New Altair Community Member
    Hello Varun

    OK. Sorry for the stupid question, but why is the value always the same? (No trend, no seasonality, same prediction for Q1-4.)
  • varunm1
    varunm1 New Altair Community Member
    I observed that the model is performing poorly. If you look at the squared correlation value in the performance output, it is zero, which means the model is not good at all. This may be due to how little data is in your dataset (7 examples is very small). Try simple models like GLM and see how it goes; you can also look at time-series models like ARIMA.
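
To illustrate how a flat forecast and a zero squared correlation fit together, here is a small plain-Python sketch. The numbers are made up, and returning 0.0 when a series is constant is an assumed convention (the correlation is mathematically undefined in that case), not a statement about how RapidMiner computes the value internally.

```python
from statistics import fmean


def squared_correlation(actual, predicted):
    """Squared Pearson correlation; 0.0 when either series is constant."""
    mean_a, mean_p = fmean(actual), fmean(predicted)
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_a == 0 or var_p == 0:
        # Correlation is undefined here (0/0); reporting 0.0 is an assumed convention.
        return 0.0
    return cov * cov / (var_a * var_p)


actual = [120.0, 135.0, 150.0, 140.0]      # made-up quarterly values
flat = [136.0, 136.0, 136.0, 136.0]        # same forecast for every quarter
trending = [118.0, 133.0, 152.0, 141.0]    # forecast that follows the pattern

print(squared_correlation(actual, flat))
print(squared_correlation(actual, trending))
```

A forecast that is identical for every period has no variation to correlate with the actual values, so it scores zero no matter how close its level is to the average.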