
Cross Validation

User: "moises_mjs53"
New Altair Community Member

Hello! For educational purposes I built a predictive neural network for stock market price prediction, following a video lesson from Thomas Ott. My question: after the network is trained, when it is cross validated, should I set the Attribute Name field of the Set Role operator to prediction(label) and the target role to prediction? Or do I set the attribute to prediction(label) and the target role to label?

    User: "Thomas_Ott"
    New Altair Community Member
    Accepted Answer

    Oh, I'm glad you're building off my old tutorials. I really need to update them one day. :)

     

    For the scoring set, the data you want to predict a label for, you normally don't include the label, so there will be no column for it. When it gets scored, it will automatically create a new prediction column as well as a confidence column for each label class. In my videos those would have been columns for the predicted label, confidence(up), and confidence(down).
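
    Here is a minimal sketch of that idea in Python with pandas and scikit-learn (not RapidMiner itself; the attribute names, toy values, and the up/down classes are made up for illustration): the scoring set carries no label column, and scoring adds a prediction column plus one confidence column per class.

    ```python
    # Not the RapidMiner operators themselves, just a pandas/scikit-learn analogue.
    import pandas as pd
    from sklearn.linear_model import LogisticRegression

    # Toy training data: two made-up attributes plus the label column.
    train = pd.DataFrame({
        "return_lag1": [0.01, -0.02, 0.03, -0.01, 0.02, -0.03],
        "return_lag2": [-0.01, 0.02, 0.01, -0.02, 0.03, -0.01],
        "label":       ["up", "down", "up", "down", "up", "down"],
    })
    features = ["return_lag1", "return_lag2"]
    model = LogisticRegression().fit(train[features], train["label"])

    # Scoring set: same attributes, but no label column at all.
    score = pd.DataFrame({
        "return_lag1": [0.015, -0.025],
        "return_lag2": [0.005, 0.010],
    })

    # Scoring adds a prediction column and one confidence column per class,
    # analogous to prediction(label), confidence(up), confidence(down).
    score["prediction(label)"] = model.predict(score[features])
    proba = model.predict_proba(score[features])
    for i, cls in enumerate(model.classes_):
        score[f"confidence({cls})"] = proba[:, i]
    print(score)
    ```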

    User: "moises_mjs53"
    New Altair Community Member
    OP

    Thank you Thomas. So in the accuracy report, the percentage that appears is really what the network managed to hit in the period it was not trained on? E.g. with 1000 values, 700 for training and 300 for testing, the accuracy result is what the network was able to achieve on those 300 test examples? And to do a cross validation, can I put the Cross Validation operator in place of the Sliding Window Validation, training the network and cross validating at the same time?
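
    For what it's worth, here is a generic holdout sketch in scikit-learn (not the RapidMiner process from the videos; the data are random and the 700/300 split only mirrors the numbers above): the accuracy is computed purely on the rows the model never saw during training.

    ```python
    # Plain scikit-learn holdout sketch; accuracy is measured on the 300 held-out rows.
    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPClassifier
    from sklearn.metrics import accuracy_score

    rng = np.random.default_rng(0)
    X = rng.normal(size=(1000, 5))                       # 1000 example rows, 5 attributes
    y = (X[:, 0] + rng.normal(scale=0.5, size=1000) > 0).astype(int)

    X_train, X_test, y_train, y_test = train_test_split(
        X, y, train_size=700, test_size=300, shuffle=False)

    model = MLPClassifier(hidden_layer_sizes=(10,), max_iter=1000,
                          random_state=0).fit(X_train, y_train)
    print("accuracy on the 300 unseen rows:",
          accuracy_score(y_test, model.predict(X_test)))
    ```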

    User: "Thomas_Ott"
    New Altair Community Member
    Accepted Answer

    So with Cross Validation, depending on the "k" parameter, it will cut your data set into "k" groups. It will train on "k-1" of the groups and test on the remaining 1, then it will take another "k-1" groups, train on those, and test on the 1 left out. It does this k times and then trains the model on the whole data set. The accuracy value is the average over the "k" test folds, with one standard deviation. This gives you an idea of how stable your model is and what you can reasonably expect on unseen data.
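
    As a rough illustration of that description (a scikit-learn sketch, not the RapidMiner operator; the random data and the decision-tree learner are placeholders), k-fold cross validation reports the mean accuracy and its standard deviation over the k test folds, and the final model is then trained on the full data set:

    ```python
    # k-fold cross validation: k train/test rounds, then a final fit on all the data.
    import numpy as np
    from sklearn.model_selection import cross_val_score
    from sklearn.tree import DecisionTreeClassifier

    rng = np.random.default_rng(1)
    X = rng.normal(size=(300, 4))
    y = (X[:, 0] > 0).astype(int)

    k = 10
    model = DecisionTreeClassifier(random_state=0)
    scores = cross_val_score(model, X, y, cv=k)      # accuracy on each of the k folds
    print(f"accuracy: {scores.mean():.3f} +/- {scores.std():.3f} over {k} folds")

    model.fit(X, y)                                  # final model trained on the whole data set
    ```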

     

    The Sliding Window Validation operator works differently; it's like a backtest. The window widths are the time periods you want to use for training, testing, and so forth.
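
    A rough Python analogue of that backtest idea (not the Sliding Window Validation operator itself; the window widths, step, and data are hypothetical): train on a fixed-width window of past rows, test on the window that follows, then slide both windows forward through time.

    ```python
    # Sliding-window backtest sketch: train on one window, test on the next, slide forward.
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import accuracy_score

    rng = np.random.default_rng(2)
    X = rng.normal(size=(500, 3))                 # 500 time-ordered rows
    y = (X[:, 0] > 0).astype(int)

    train_width, test_width, step = 100, 20, 20   # hypothetical window settings
    for start in range(0, len(X) - train_width - test_width + 1, step):
        tr = slice(start, start + train_width)
        te = slice(start + train_width, start + train_width + test_width)
        model = LogisticRegression().fit(X[tr], y[tr])
        acc = accuracy_score(y[te], model.predict(X[te]))
        print(f"test rows {te.start}-{te.stop - 1}: accuracy {acc:.2f}")
    ```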

    User: "moises_mjs53"
    New Altair Community Member
    OP

    Perfect, Thomas, so it's worthwhile to train the network (backtest) and then cross-validate. Thank you very much for your help, it was very important to me. Success.