(Bug?) Which definition of autocorrelation does the AutoCorrelation operator in the ValueSeries plugin use?
owen
New Altair Community Member
Hello statistical friends,
I examined the code of
[tt]rapidminer\operator\valueseries\transformations\basis\AutoCorrelation.java[/tt]
so that I could understand the meaning of the three input parameters factor, start, end.
Here is the relevant excerpt from [tt]AutoCorrelation.java[/tt] v5.3.000.
[code]
for (int i = start; i < end; i++) {
    double differences = 0.0d;
    int numberOfValues = 0;
    for (int j = 0; j < series.length(); j++) {
        int lag = (int) ((double) factor / (double) i);
        if ((j + lag) >= series.length())
            break;
        numberOfValues++;
        double difference = series.getValue(j) - series.getValue(j + lag);
        differences += (difference * difference);
    }
    differences /= numberOfValues;
    displacements[i - start] = i;
    result[i - start] = new Vector(differences);
}
[/code]
The function appears to calculate an estimate that converges to
[tt]result( i ) = 2 Variance( x ) - 2 Covariance( x( j ), x( j+factor/i ) )[/tt].
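For reference, this is the working behind that reading. It is only a sketch and it assumes the series is weakly stationary (constant mean and variance, covariance depending only on the lag). The symbols \mu for the mean and \ell for the integer lag (int)(factor / i) are my own notation, not names from the plugin.

```latex
\begin{align*}
\mathbb{E}\big[(x_j - x_{j+\ell})^2\big]
  &= \mathbb{E}[x_j^2] + \mathbb{E}[x_{j+\ell}^2] - 2\,\mathbb{E}[x_j\, x_{j+\ell}] \\
  &= 2\big(\operatorname{Var}(x) + \mu^2\big)
     - 2\big(\operatorname{Cov}(x_j, x_{j+\ell}) + \mu^2\big) \\
  &= 2\operatorname{Var}(x) - 2\operatorname{Cov}(x_j, x_{j+\ell})
\end{align*}
```

The inner loop averages the squared differences over all valid j, which is the sample estimate of the left-hand side. If the series is not stationary, the mean terms do not cancel and this identity need not hold.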
The term "factor/i" is unfamiliar to me.
To calculate the autocovariance function of a sequence x, I would have expected to see [tt]Cov( x( j ), x( j+factor * i ) )[/tt]. In that form, the purpose of factor would be to let the user control the computational effort by sampling the lag axis sparsely.
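To make the difference concrete, here is a small sketch that lists the lags each formula visits. It is my own illustration, not code from the plugin: the class LagSampling and its methods are hypothetical names, factor is assumed to be an integer, and the values factor = 12, start = 1, end = 7 are picked only for the example.

```java
import java.util.Arrays;

// Hypothetical helper for illustration only; not part of the ValueSeries plugin.
class LagSampling {

    // Lag as computed in the excerpt: factor / i, truncated to an int.
    static int divisionLag(int factor, int i) {
        return (int) ((double) factor / (double) i);
    }

    // Lag as I would have expected it: factor * i.
    static int productLag(int factor, int i) {
        return factor * i;
    }

    // Lags visited by the outer loop for i = start .. end - 1.
    static int[] lags(int factor, int start, int end, boolean useDivision) {
        int[] out = new int[end - start];
        for (int i = start; i < end; i++) {
            out[i - start] = useDivision ? divisionLag(factor, i) : productLag(factor, i);
        }
        return out;
    }

    public static void main(String[] args) {
        int factor = 12, start = 1, end = 7; // assumed example values
        System.out.println("factor / i : " + Arrays.toString(lags(factor, start, end, true)));
        System.out.println("factor * i : " + Arrays.toString(lags(factor, start, end, false)));
    }
}
```

With these values the division form visits the lags 12, 6, 4, 3, 2, 2, so the lag shrinks as i grows and the last two outputs repeat the same lag. The product form visits 12, 24, 36, 48, 60, 72, which is the evenly spaced sampling of the lag axis that I had expected.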
A few questions arise for me:
1. Is this a bug? Or is "autocorrelation transformation" something mathematically distinct from the autocovariance of the sequence?
2. Suppose the output were [tt]result(lag) = 2 Var(x) - 2 Cov( x( j ), x( j+lag ) )[/tt]. Is there a reason in machine learning why that expression is more useful than just [tt]result(lag) = Cov( x( j ), x( j+lag ) )[/tt]?
3. Where is the public repository for the ValueSeries plugin, so that I can be sure that my comments are relevant to the latest code?
Thanks and regards,
Owen