Finding Peak Times in a timeseries dataset

pix123 New Altair Community Member
edited November 2024 in Community Q&A

Hi there,

I am working with a data series that has a date-time stamp in one column. I am looking for a way to identify the peak times over the period covered by the collected date-time stamps. Is there a way to handle this in RapidMiner? If further details are needed, please let me know. Thanks.

Answers

  • Telcontar120
    Telcontar120 New Altair Community Member

    How do you define "peak" for this purpose? Finding a single maximum in a series is easily done using a number of different operators. But finding "peaks" might imply some kind of underlying periodic function or a variable definition of what exactly constitutes a peak. That kind of analysis is a bit trickier; you might want to check out the Series extension from the marketplace and look at some of the operators in there.

  • pix123
    pix123 New Altair Community Member

    Thanks for the quick reply. By "peak" I mean the time of a given day at which the value is highest. I am trying to determine at what times of the day usage is highest; the data has been recorded in 30-minute intervals over a 140-day period. I hope this clarifies things. Is there a particular operator in the time series extension package you would recommend?

  • Telcontar120
    Telcontar120 New Altair Community Member

    It sounds like you have many separate days' worth of data, so if you are looking for patterns, you can simply aggregate by time of day (with 30-minute intervals you should have 48 data points per day) and then calculate the average and variance for each time slot. This will give you a sense of which times are more likely to be higher than others. You can also get the minimum and maximum for each time of day to see how they compare to the average.
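For illustration only, here is a minimal sketch of that aggregation in plain Python rather than as a RapidMiner process. The function name `time_of_day_profile`, the `(timestamp, value)` input layout, and the sample data are all assumptions made up for the example; they are not taken from the original dataset.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean, pvariance


def time_of_day_profile(readings):
    """Group (timestamp, value) pairs by time of day and summarize each slot."""
    by_slot = defaultdict(list)
    for ts, value in readings:
        by_slot[ts.time()].append(value)
    return {
        slot: {
            "mean": mean(values),
            "variance": pvariance(values),  # population variance per slot
            "min": min(values),
            "max": max(values),
        }
        for slot, values in sorted(by_slot.items())
    }


# Hypothetical sample: 3 days of 30-minute readings with a bump at 18:00.
start = datetime(2024, 1, 1)
readings = []
for step in range(3 * 48):
    ts = start + timedelta(minutes=30 * step)
    value = 10.0 if (ts.hour == 18 and ts.minute == 0) else 1.0
    readings.append((ts, value))

profile = time_of_day_profile(readings)
# The slot with the highest average is a candidate "peak time".
busiest_slot = max(profile, key=lambda slot: profile[slot]["mean"])
print(len(profile), busiest_slot)
```

Comparing each slot's mean against its variance, minimum, and maximum shows whether a high average reflects a consistent pattern or a few unusual days.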

    However, if you are looking to identify the specific time slot on each individual day that corresponds to the maximum value for that day, the process is going to be more complex. You'll have to aggregate by day to calculate the daily maximum, and then identify which particular time slot matches that value.
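The per-day variant could look roughly like the following sketch, again in plain Python and not as a RapidMiner process. The helper `daily_peak_slots` and the sample readings are hypothetical. When two slots tie for a day's maximum, this version keeps the earlier one, which is an arbitrary choice.

```python
from collections import Counter
from datetime import datetime, timedelta


def daily_peak_slots(readings):
    """Return {date: (time_of_day, value)} for the largest reading of each day."""
    peaks = {}
    for ts, value in readings:
        day = ts.date()
        # Strict '>' keeps the earliest slot when several slots tie.
        if day not in peaks or value > peaks[day][1]:
            peaks[day] = (ts.time(), value)
    return peaks


# Hypothetical sample: day 0 peaks at 08:00, days 1 and 2 peak at 18:30.
start = datetime(2024, 1, 1)
peak_steps = {0: 16, 1: 37, 2: 37}  # index of the 30-minute slot within the day
readings = []
for day in range(3):
    for step in range(48):
        ts = start + timedelta(days=day, minutes=30 * step)
        value = 10.0 if step == peak_steps[day] else 1.0
        readings.append((ts, value))

peaks = daily_peak_slots(readings)
# Count how often each time slot was the daily peak.
slot_counts = Counter(slot for slot, _ in peaks.values())
print(slot_counts.most_common(1))
```

Counting how often each slot wins gives a second view of "peak times" that is less sensitive to a few extreme days than the averages are.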

    Neither of these processes would require the Series extension, by the way. That's more useful if you are trying to do things like calculate moving averages, smooth series data, or do time series forecasting such as ARIMA.

  • sgenzer
    sgenzer
    Altair Employee

    You can also use "Generate Attributes" to create a new attribute that "gets" the hour of the timestamp. Then you can cleanly aggregate on that attribute.
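As a rough analogue of that step outside RapidMiner, the sketch below derives an hour value from each timestamp and then averages by it. It is plain Python with made-up data; `mean_by_hour` is a hypothetical helper and not a RapidMiner function. Note that grouping by hour merges the two 30-minute slots within each hour.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def mean_by_hour(readings):
    """Average the values of (timestamp, value) pairs per hour of day (0-23)."""
    totals = defaultdict(lambda: [0.0, 0])  # hour -> [running sum, count]
    for ts, value in readings:
        bucket = totals[ts.hour]  # the derived "hour" attribute
        bucket[0] += value
        bucket[1] += 1
    return {hour: total / count for hour, (total, count) in sorted(totals.items())}


# Hypothetical sample: 2 days of 30-minute readings, higher at 18:00 and 18:30.
start = datetime(2024, 1, 1)
readings = [
    (start + timedelta(minutes=30 * step), 5.0 if step % 48 in (36, 37) else 1.0)
    for step in range(2 * 48)
]

hourly = mean_by_hour(readings)
peak_hour = max(hourly, key=hourly.get)
print(peak_hour, hourly[peak_hour])
```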

     


    Scott