Data Preprocessing Ideas

User: "pix123"
New Altair Community Member
Updated by Jocelyn

I am working with a relatively clean dataset: it has no missing values, and most of the attributes are numeric, with one being a date-time stamp recorded every 30 minutes. I need to carry out some pre-processing on it. I have the ideas below but am also looking for other suggestions. Thanks.

 

- Rename some of the numeric attributes so they are easier to identify

- Set roles

 

Ultimately I will build a model to predict the temperature using regression models and the date-time stamp. The model will be trained and then tested.

    User: "JEdward"
    New Altair Community Member

    Perhaps try windowing the time data, or add a column showing whether each numeric value is higher or lower than the value 30 minutes earlier.
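
    To make the two ideas concrete, here is a minimal sketch in plain Python, outside RapidMiner. It is only an illustration: the function names (`make_windows`, `direction_vs_previous`), the window size of 3 and the temperature readings are all made up for the example, and RapidMiner's own operators may behave differently.

```python
from datetime import datetime, timedelta


def make_windows(values, window_size, horizon=1):
    """Turn a series into (window, target) pairs.

    Each window holds `window_size` consecutive past readings; the target
    is the reading `horizon` steps after the end of the window.
    """
    rows = []
    for end in range(window_size, len(values) - horizon + 1):
        window = values[end - window_size:end]
        target = values[end + horizon - 1]
        rows.append((window, target))
    return rows


def direction_vs_previous(values):
    """Label each reading 'up', 'down' or 'same' relative to the one before.

    The first reading has no predecessor, so its label is None.
    """
    flags = [None]
    for prev, cur in zip(values, values[1:]):
        if cur > prev:
            flags.append("up")
        elif cur < prev:
            flags.append("down")
        else:
            flags.append("same")
    return flags


# Hypothetical half-hourly temperature readings (invented values).
start = datetime(2024, 1, 1, 0, 0)
timestamps = [start + timedelta(minutes=30 * i) for i in range(6)]
temps = [10.0, 10.5, 10.5, 9.8, 11.2, 12.0]

windows = make_windows(temps, window_size=3)
directions = direction_vs_previous(temps)

for window, target in windows:
    print(window, "->", target)
print(list(zip(timestamps, directions)))
```

    Each window becomes one training row: the past readings are the inputs and the next reading is the value a regression model would learn to predict. In this sketch the date-time stamp is used only to keep the readings in order.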

     

     

     
    User: "pix123"
    New Altair Community Member
    OP

    Hi Edward,

     

    Thanks for the feedback. I am pretty new to RapidMiner (RM). Can you explain a little more about how windowing works? Does the date-time attribute need to have the role of label? Thanks.

     

     

    User: "SGolbert"
    New Altair Community Member

    Hi Sammie,

     

    First of all, you need the time series extension. You can find it in the Marketplace (menu Extensions -> Marketplace). Try experimenting with the operators and their tutorials.

     

    I think that your question is more about Feature Generation than about RapidMiner. You will probably need to consult some Time Series literature. I can recommend the following:

     

    Helmut Lütkepohl, New Introduction to Multiple Time Series Analysis, Springer (2006)

     

    Kind regards,

    Sebastian