Algorithms are "cheating" and copying the right label from other instances
sebasvog
Hi everyone,
I have a problem with my model. It should predict a monthly product volume from some given attributes.
My training data covers roughly 60 past months. Each instance in the dataset represents one day. Two of the attributes are "month" and "year". The label is the product volume at the end of the month, so every instance of a given month (~30 days per month, hence ~30 instances) has the same label. When I train the algorithm (Deep Learning, evaluated via Cross Validation) and look at the performance measure (relative_error), it seems the algorithm looks at the attributes "month" and "year" and adopts the label value from another row with the same month and year as its prediction for this instance.
I hope you can follow my description. If there is something you don't understand feel free to ask.
I would be very thankful if someone could tell me whether my guess is right and how I can avoid this mistake.
Now I am trying to avoid this by just having the month as an attribute, not month+year.
Thanks for your replies,
Sebastian
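The suspicion described above can be checked with a small experiment. The sketch below is not the original process; it uses synthetic data and a deliberately "cheating" lookup model (all names and values are hypothetical) to show that a row-wise random split lets a model memorise the label per (year, month), while holding out whole months does not.

```python
import random

random.seed(0)

# Synthetic stand-in for the dataset: one row per day, label = month-end volume.
rows = []
for year in range(2015, 2020):                      # 5 years = 60 months
    for month in range(1, 13):
        volume = random.uniform(100.0, 200.0)       # hypothetical monthly volume
        for day in range(1, 31):
            rows.append({"year": year, "month": month, "day": day, "label": volume})


def lookup_model(train):
    """A 'cheating' model: memorise the label seen for each (year, month)."""
    table = {(r["year"], r["month"]): r["label"] for r in train}
    fallback = sum(r["label"] for r in train) / len(train)
    return lambda r: table.get((r["year"], r["month"]), fallback)


def mean_relative_error(model, test):
    return sum(abs(model(r) - r["label"]) / r["label"] for r in test) / len(test)


# 1) Row-wise random split: days of the same month land in both train and test.
shuffled = rows[:]
random.shuffle(shuffled)
cut = int(0.8 * len(shuffled))
random_err = mean_relative_error(lookup_model(shuffled[:cut]), shuffled[cut:])

# 2) Month-wise split: the last 12 months are held out completely.
months = sorted({(r["year"], r["month"]) for r in rows})
held_out = set(months[-12:])
train = [r for r in rows if (r["year"], r["month"]) not in held_out]
test = [r for r in rows if (r["year"], r["month"]) in held_out]
grouped_err = mean_relative_error(lookup_model(train), test)

print("random split error: ", random_err)
print("month-wise split error:", grouped_err)
```

If the row-wise error is far lower than the month-wise error, the validation scheme, rather than the learner, is the likely source of the overly optimistic result.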
MartinLiebig
Hi,
I would recommend using a Sliding Window Validation rather than a Cross Validation. This gives you a fair estimate of the performance.
Best,
Martin
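To illustrate the idea behind a sliding window validation, here is a minimal, generic sketch. It is not the RapidMiner operator; the function name and the window sizes are assumptions chosen for illustration. Each window trains on a block of consecutive months and tests on the months that immediately follow, so no test month is ever seen during training.

```python
def sliding_windows(periods, train_size, test_size, step=1):
    """Yield (train_periods, test_periods) pairs in chronological order."""
    periods = sorted(periods)
    start = 0
    while start + train_size + test_size <= len(periods):
        train = periods[start:start + train_size]
        test = periods[start + train_size:start + train_size + test_size]
        yield train, test
        start += step


# Hypothetical calendar: 60 months, identified as (year, month) tuples.
months = [(y, m) for y in range(2015, 2020) for m in range(1, 13)]

# Train on 36 months, test on the next 6, then move the window forward by 6.
windows = list(sliding_windows(months, train_size=36, test_size=6, step=6))

for train, test in windows:
    print("train", train[0], "to", train[-1], "| test", test[0], "to", test[-1])
```

All daily rows of a month would go wherever that month goes, which keeps rows sharing a label on the same side of the split.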
sebasvog
Hi Martin,
Thank you very much for your answer. I think this validation method could help me a lot in estimating the performance of my current model!
However, I think I have to create a new process with a modified dataset (without year and month as attributes, maybe only month) to have a valid solution for my problem.
Regards,
Sebastian
MartinLiebig
Hi,
Either that, or change the preprocessing so that you get the month or the quarter of the year. That may help.
BR,
Martin
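A minimal sketch of that preprocessing step, assuming each row carries a calendar date (the function name and the example date are hypothetical): only the month and quarter of the year are kept as attributes, so the year no longer identifies a single month.

```python
from datetime import date


def calendar_features(d):
    """Return month-of-year and quarter-of-year for a date, dropping the year."""
    return {"month": d.month, "quarter": (d.month - 1) // 3 + 1}


features = calendar_features(date(2018, 8, 15))
print(features)
```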
sebasvog
Hi,
I tried to apply "Sliding Window Validation" to my model, but it seems this type of validation is only applicable to time series data.
I know that my data is "some kind of" time series data, but I am trying to solve the problem with a regression using Neural Networks (Deep Learning).
So I cannot use Sliding Window Validation, right?
I also tried time series models (ARIMA) on my data (period = day, period = month), but the results were very bad (I guess I do not have enough historical data, just 60 months).
Regards,
Sebastian
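In general, a time-ordered validation only needs the rows to be sortable in time; the learner itself can be any regression model. Whether the RapidMiner operator accepts this particular dataset is not settled in the thread, so the following is only a generic sketch with synthetic data and a toy regressor (all names and values are assumptions). It trains on all months before a given month and tests on that month.

```python
from collections import defaultdict


def fit_calendar_month_mean(train_rows):
    """Toy regressor: predict the mean training label of the same calendar month."""
    sums = defaultdict(lambda: [0.0, 0])
    for r in train_rows:
        sums[r["month"]][0] += r["label"]
        sums[r["month"]][1] += 1
    overall = sum(r["label"] for r in train_rows) / len(train_rows)
    means = {m: total / n for m, (total, n) in sums.items()}
    return lambda r: means.get(r["month"], overall)


def forward_validation(rows, fit, min_train_months=24):
    """Train on every month before month i, test on month i; return per-month errors."""
    months = sorted({(r["year"], r["month"]) for r in rows})
    errors = []
    for i in range(min_train_months, len(months)):
        past = set(months[:i])
        train = [r for r in rows if (r["year"], r["month"]) in past]
        test = [r for r in rows if (r["year"], r["month"]) == months[i]]
        model = fit(train)
        err = sum(abs(model(r) - r["label"]) / r["label"] for r in test) / len(test)
        errors.append((months[i], err))
    return errors


# Hypothetical data: seasonal volume plus a yearly trend, 30 daily rows per month.
rows = [
    {"year": y, "month": m, "day": d, "label": 100.0 + 10.0 * m + 5.0 * (y - 2016)}
    for y in range(2016, 2019) for m in range(1, 13) for d in range(1, 31)
]

errors = forward_validation(rows, fit_calendar_month_mean)
for month, err in errors:
    print(month, round(err, 4))
```

Any model exposing a fit step and a predict step could replace the toy regressor; the split logic stays the same.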