Data Preprocessing Ideas

pix123
pix123 New Altair Community Member
edited November 2024 in Community Q&A

I am working with a dataset that is relatively clean: it has no missing values, and most of the attributes are numeric, with one being a date-time stamp recorded every 30 minutes. I need to carry out some pre-processing on it and have the ideas below, but I am also looking for other suggestions. Thanks.

- Rename some of the numeric attributes so they are easier to identify

- Set roles

Ultimately, I want to predict the temperature from the date-time stamp using regression models, which I will train and then test.
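
For readers who want to see the idea outside RapidMiner, here is a minimal Python sketch (standard library only) of two preprocessing steps that usually matter when a date-time stamp drives a regression: turning the stamp into numeric features, and splitting train and test data in time order rather than at random. The function names, the 80/20 split, and the readings are all made up for illustration; they are not taken from the poster's dataset or from any RapidMiner operator.

```python
import math
from datetime import datetime, timedelta


def time_features(ts):
    """Turn a timestamp into numeric features a regression model can use.

    The time of day is encoded on a circle so that 23:30 and 00:00 end up
    close together instead of at opposite ends of a 0-24 scale.
    """
    minutes = ts.hour * 60 + ts.minute
    angle = 2 * math.pi * minutes / (24 * 60)
    return {
        "hour_sin": math.sin(angle),
        "hour_cos": math.cos(angle),
        "day_of_week": ts.weekday(),  # Monday = 0
    }


def chronological_split(rows, train_fraction=0.8):
    """Split (timestamp, value) rows so the test set lies after the training set."""
    ordered = sorted(rows, key=lambda row: row[0])
    cut = int(len(ordered) * train_fraction)
    return ordered[:cut], ordered[cut:]


# Made-up readings every 30 minutes (hypothetical values)
start = datetime(2024, 1, 1, 0, 0)
rows = [(start + timedelta(minutes=30 * i), 10.0 + 0.1 * i) for i in range(10)]

train, test = chronological_split(rows)
train_features = [time_features(ts) for ts, _ in train]
```

Splitting in time order keeps the test readings strictly later than the training readings, which is closer to how the model would be used for prediction.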

Answers

  • JEdward
    JEdward New Altair Community Member

    Perhaps try windowing the time data, or add a column that shows whether each numeric value is higher or lower than the value 30 minutes earlier.
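
Both suggestions can be sketched in a few lines of plain Python. This is only an illustration of the general idea, not how RapidMiner implements its windowing operator: `make_windows`, `direction_flags`, the window size of 3, and the temperature readings are all hypothetical. Windowing turns a run of consecutive readings into one example row, with a later reading as the value to predict.

```python
def make_windows(values, window_size=4, horizon=1):
    """Turn a series into (window, target) pairs.

    Each window holds `window_size` consecutive readings; the target is the
    reading `horizon` steps after the window ends.
    """
    pairs = []
    for start in range(len(values) - window_size - horizon + 1):
        window = values[start:start + window_size]
        target = values[start + window_size + horizon - 1]
        pairs.append((window, target))
    return pairs


def direction_flags(values):
    """Compare each reading with the previous one.

    With one reading every 30 minutes, the previous reading is the value
    30 minutes earlier.
    """
    flags = [None]  # the first reading has nothing to compare against
    for previous, current in zip(values, values[1:]):
        if current > previous:
            flags.append("higher")
        elif current < previous:
            flags.append("lower")
        else:
            flags.append("same")
    return flags


# Made-up temperature readings at 30-minute intervals
temperature = [10.0, 10.5, 10.2, 10.2, 11.0, 11.4]

windows = make_windows(temperature, window_size=3)
flags = direction_flags(temperature)
```

Here each window of three readings becomes the inputs of one training example and the reading that follows becomes its label, so the date-time stamp itself does not have to be the label.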

  • pix123
    pix123 New Altair Community Member

    Hi Edward,

    Thanks for the feedback. I am pretty new to RapidMiner (RM). Can you explain a little more about how windowing works? Does the date-time attribute need to have the role of label? Thanks.

  • SGolbert
    SGolbert New Altair Community Member

    Hi Sammie,

    First of all, you need the Time Series extension. You can find it in the Marketplace (menu Extensions -> Marketplace). Experiment with the operators and their tutorials.

    I think that your question is more about Feature Generation than about RapidMiner. You will probably need to consult some Time Series literature. I can recommend the following:

    Helmut Lütkepohl, New Introduction to Multiple Time Series Analysis, Springer (2006)

    Kind regards,

    Sebastian