Data Preprocessing Ideas

I am working with a dataset that is relatively clean: it has no missing values, and most of the attributes are numeric, with one being a date-time stamp recorded every 30 minutes. I need to apply some preprocessing techniques to it and have the ideas below, but I am also looking for other suggestions. Thanks.
- Rename some of the numeric attributes so they are easier to identify
- Set roles
Ultimately, I will build regression models that predict the temperature using the date-time stamp. The models will be trained and then tested.
Answers
-
Perhaps try windowing the time data, or add a column that shows whether each numeric value is higher or lower than the value 30 minutes earlier.
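To make the two ideas concrete outside RapidMiner, here is a minimal pandas sketch. The column names (timestamp, temperature), the sample values, and the window size of 3 are all assumptions made up for illustration; they do not come from the actual dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical data: names and values are invented for this example.
df = pd.DataFrame({
    "timestamp": pd.date_range("2024-01-01 00:00", periods=6, freq="30min"),
    "temperature": [10.0, 10.5, 10.2, 10.2, 11.0, 10.8],
})

WINDOW = 3  # assumed number of past readings to use as predictors

# Windowing: each row receives the previous WINDOW readings as extra columns.
for lag in range(1, WINDOW + 1):
    df[f"temperature_lag{lag}"] = df["temperature"].shift(lag)

# Direction versus the reading 30 minutes earlier:
# 1.0 = higher, -1.0 = lower, 0.0 = unchanged (NaN for the first row).
df["direction"] = np.sign(df["temperature"].diff())

# The first WINDOW rows have no complete history, so drop them before training.
windowed = df.dropna().reset_index(drop=True)
print(windowed)
```

In this sketch the temperature column would be the prediction target, and the lagged columns are the inputs; the timestamp is only used to keep the rows in order.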
-
Hi Edward,
Thanks for the feedback. I am pretty new to RapidMiner. Can you explain a little more about how windowing works? Does the date-time attribute need to have the role of label? Thanks.
-
Hi Sammie,
First of all, you need the time series extension. You can find it in the Marketplace (menu Extensions -> Marketplace). Experiment with its operators and their tutorials.
I think that your question is more about feature generation than about RapidMiner, so you will probably need to consult some time series literature. I can recommend the following:
Helmut Lütkepohl, New Introduction to Multiple Time Series Analysis, Springer (2006)
Kind regards,
Sebastian