"Calculate confidence interval of RMSE"
wessel
New Altair Community Member
Dear All,
I have two forecasting algorithms that each output a temperature forecast 24 hours ahead.
Algorithm A uses 1-nearest neighbours.
Algorithm B is a baseline algorithm, and simply outputs the last known temperature value as a prediction.
Let's say I calculate the Mean Squared Error and the variance of the squared error for A and B on a separate test set with N data points.
Then what is the confidence interval of MSE_A?
And what is the confidence interval of MSE_B?
Best regards,
Wessel
Answers
-
I have solved this problem as follows, although I'm not sure it is correct:
diffErrMean = baseErrMean - predErrMean;
diffVarMean = baseVarMean + predVarMean;
varOverSqrtN = diffVarMean / Math.sqrt(N);
z = diffErrMean / varOverSqrtN;
z = Math.abs(z);
upper = diffErrMean + z * diffVarMean;
lower = diffErrMean - z * diffVarMean;
(Where B = baseline = baseErrMean, and A = algorithm = predErrMean)
I can then print something like:
N: 13 // number of test points
Target: "temp"
Run time: 0.105 ms
predErrMean: 0.134 predVarMean: 0.067
baseErrMean: 0.246 baseVarMean: 0.141
diffErrMean: 0.113 +- 0.058 = [-0.003, 0.228] // kinda weird that this is already nearly significant with only 13 test points
Ratio: 1.843
-
Okay this does not make any sense.
You need the CDF of the t-distribution (strictly, its inverse) to get the critical value at the 2.5% point.
But this is hard in Java, since the standard library gives no easy access to the CDF of the t-distribution.
So for now I think I will assume the normal distribution and use confidence interval = MEAN +- 2 * S.D.
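A minimal sketch of such a normal-approximation interval, under assumptions that are not stated in the posts above: the per-point squared errors of both algorithms are available as arrays (the names `baseSqErr` and `predSqErr` are hypothetical), the differences are taken pairwise on the same test points, and the S.D. used is that of the mean (S.D. / sqrt(N)) rather than of the individual differences. The sample values in `main` are made up purely for illustration.

```java
class NormalApproxInterval {

    /** Pairwise differences (baseline minus algorithm) on the same test points. */
    static double[] pairedDiff(double[] baseSqErr, double[] predSqErr) {
        if (baseSqErr.length != predSqErr.length) {
            throw new IllegalArgumentException("arrays must have the same length");
        }
        double[] d = new double[baseSqErr.length];
        for (int i = 0; i < d.length; i++) {
            d[i] = baseSqErr[i] - predSqErr[i];
        }
        return d;
    }

    /** Returns {lower, upper} = mean -/+ z * sd / sqrt(N) for the values in d. */
    static double[] meanInterval(double[] d, double z) {
        int n = d.length;
        if (n < 2) {
            throw new IllegalArgumentException("need at least two values");
        }
        double mean = 0.0;
        for (double v : d) {
            mean += v;
        }
        mean /= n;

        double sumSq = 0.0;
        for (double v : d) {
            sumSq += (v - mean) * (v - mean);
        }
        double sd = Math.sqrt(sumSq / (n - 1)); // sample standard deviation
        double se = sd / Math.sqrt(n);          // standard error of the mean

        return new double[] { mean - z * se, mean + z * se };
    }

    public static void main(String[] args) {
        // Made-up squared errors, for illustration only.
        double[] baseSqErr = { 0.30, 0.10, 0.45, 0.20, 0.25 };
        double[] predSqErr = { 0.10, 0.12, 0.20, 0.15, 0.05 };

        // 1.96 is the two-sided 95% point of the standard normal distribution.
        double[] ci = meanInterval(pairedDiff(baseSqErr, predSqErr), 1.96);
        System.out.printf("diffErrMean 95%% interval: [%.3f, %.3f]%n", ci[0], ci[1]);
    }
}
```

If the interval excludes 0, the difference would be called significant under this approximation. Dividing by sqrt(N) is what lets the interval shrink as the test set grows.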
But then the problem is:
The differences are not normally distributed.
The largest possible difference occurs when algorithm A has zero error and the baseline has some big error; the S.D. would then be equal to that big error.
And nothing would ever be significant.
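One option that needs neither the normality assumption nor the t-distribution CDF is a percentile bootstrap on the per-point differences. This is a swapped-in technique, not something proposed in the posts above. The sketch below assumes the paired differences (baseline squared error minus algorithm squared error) are already in an array; the name `diff`, the resample count, and the seed are all arbitrary choices for illustration, and the sample values are made up. With N as small as 13 the resulting interval is only a rough guide.

```java
import java.util.Arrays;
import java.util.Random;

class BootstrapInterval {

    /**
     * Percentile bootstrap interval for the mean of diff.
     * Resamples diff with replacement, records each resample's mean,
     * and returns the alpha/2 and 1 - alpha/2 percentiles of those means.
     */
    static double[] percentileInterval(double[] diff, int resamples, double alpha, long seed) {
        if (diff.length == 0 || resamples < 1) {
            throw new IllegalArgumentException("need data and at least one resample");
        }
        Random rng = new Random(seed);
        int n = diff.length;
        double[] means = new double[resamples];

        for (int b = 0; b < resamples; b++) {
            double sum = 0.0;
            for (int i = 0; i < n; i++) {
                sum += diff[rng.nextInt(n)]; // draw one point with replacement
            }
            means[b] = sum / n;
        }
        Arrays.sort(means);

        // Indices of the lower and upper percentiles in the sorted means.
        int lo = (int) Math.floor((alpha / 2.0) * (resamples - 1));
        int hi = (int) Math.ceil((1.0 - alpha / 2.0) * (resamples - 1));
        return new double[] { means[lo], means[hi] };
    }

    public static void main(String[] args) {
        // Made-up paired differences, for illustration only.
        double[] diff = { 0.20, -0.02, 0.25, 0.05, 0.20, 0.00, 0.31 };

        double[] ci = percentileInterval(diff, 10000, 0.05, 42L);
        System.out.printf("bootstrap 95%% interval: [%.3f, %.3f]%n", ci[0], ci[1]);
    }
}
```

As with the normal approximation, an interval that excludes 0 would suggest the algorithm and the baseline differ on this test set.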