ARIMA parameter configuration p, q, d

Hi,

I am fairly new to data science and am exploring time series. I'm currently trying the ARIMA model, but I notice that the model's results change a lot depending on how I configure the p, q and d parameters. Is there anyone who can explain in simple words what each parameter means and how I can come up with the best configuration? Or should I start from the defaults and run a parameter optimization?

I hope someone can share his/her experience.
Thanks,
Bart

Accepted answers

MartinLiebig

Hi @Barclaeys ,

p and q are basically how far the model can look back. Keep in mind that ARIMA has three parts (Auto-Regressive, Integrated and Moving Average). p is the look-back for the AR part (how many past values are used), q for the MA part (how many past forecast errors are used). If you set, for example, p=1 and q=0, then your model is purely auto-regressive and only considers the last data point.
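To illustrate the p=1, q=0 case, here is a minimal Python sketch. It is a simplification, not how an ARIMA implementation actually estimates its coefficients (those typically use maximum likelihood): it fits y[t] = c + phi * y[t-1] by ordinary least squares on a hypothetical toy series. The function names `fit_ar1` and `forecast_next` are made up for this example.

```python
def fit_ar1(series):
    """Least-squares estimate of c and phi in y[t] = c + phi * y[t-1]."""
    x = series[:-1]  # previous values
    y = series[1:]   # current values
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var = sum((a - mean_x) ** 2 for a in x)
    phi = cov / var
    c = mean_y - phi * mean_x
    return c, phi


def forecast_next(series, c, phi):
    # With p=1 and q=0, only the last observation enters the forecast.
    return c + phi * series[-1]


# Hypothetical toy series that follows y[t] = 2 + 0.5 * y[t-1] exactly.
series = [10.0]
for _ in range(9):
    series.append(2 + 0.5 * series[-1])

c, phi = fit_ar1(series)
next_value = forecast_next(series, c, phi)
print(f"c={c:.3f}, phi={phi:.3f}, next value={next_value:.3f}")
```

With a larger p, the forecast would use the last p observations instead of just one.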

On d: that controls the I part of ARIMA. In layman's terms: ARIMA can forecast not just the time series itself but, alternatively, its derivative (for discrete data, the difference between consecutive points), and then integrate again afterwards. If you set d=0 you forecast the original series, with d=1 the first difference, and so on. I usually only try d=0 and d=1.
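To make the role of d concrete, here is a small Python sketch of what d=1 amounts to: the series is differenced once, the model works on the differences, and the forecast is added back onto the last known level. The helper names and the numbers (including the forecast of 7) are hypothetical and chosen only for illustration.

```python
def difference(series):
    """First difference: the change from each point to the next (d=1)."""
    return [b - a for a, b in zip(series, series[1:])]


def integrate(diffs, start):
    """Undo the differencing by cumulatively summing from a start value."""
    values = [start]
    for step in diffs:
        values.append(values[-1] + step)
    return values


series = [100, 103, 107, 112, 118]
diffs = difference(series)               # changes between neighbours
restored = integrate(diffs, series[0])   # back to the original series

# Hypothetical: suppose a model forecasts the next difference as 7.
forecast_diff = 7
forecast_level = series[-1] + forecast_diff  # forecast on the original scale
print(diffs, restored, forecast_level)
```

Setting d=2 would apply `difference` twice before modelling and integrate twice afterwards.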

Best,

Martin

All comments

Barclaeys

Martin, once more thanks for your feedback. Is my understanding correct that to determine the best setting for the auto-regression, I should run an ACF on my data and check for how many lags I still see a specific correlation? And if so, is there anything similar that I can run for the MA part?
Thanks, Bart

MartinLiebig

Hi @Barclaeys ,

I think there are, in general, two schools of thought when it comes to hyperparameter settings:

1. The Statistician's Way: Analyze the data and work out what the right parameters of the algorithm should be, for example with the ACF, but also with other methods. For ARIMA I am not sure if there is a standard test to figure it out. @David_A and @yyhuang are bigger experts on this topic.

2. The Data Scientist's Way: Just try many p/q/d values and find the best ones by doing a proper out-of-sample test.

I am a fan of #2, but this does not mean that #1 is wrong.

Best,

Martin