Heuristic for choice of parameter C in LibSVM
Dann
New Altair Community Member
Hi there,
I was trying to find more info on the heuristic choice of the parameter C used by RapidMiner (i.e. the one used when the parameter is given as 0). However, the only place I could find mentioning this heuristic was this old thread from this forum. Is there any other place where this heuristic is discussed? I could not find the references cited by Ingo on that thread. Hastie and Tibshirani have far too many publications to search them one by one. Perhaps someone could give me any directions for justifying this heuristic value?
Cheers,
Dann
I was trying to find more info on the heuristic choice of the parameter C used by RapidMiner (i.e. the one used when the parameter is given as 0). However, the only place I could find mentioning this heuristic was this old thread from this forum. Is there any other place where this heuristic is discussed? I could not find the references cited by Ingo on that thread. Hastie and Tibshirani have far too many publications to search them one by one. Perhaps someone could give me any directions for justifying this heuristic value?
Cheers,
Dann
Tagged:
0
Answers
-
Maybe try this:
Cherkassky: Practical Selection of SVM Parameters and Noise Estimation for SVM Regression
I would be happy if you let me know if it worked...
Milan0 -
Well, this paper deals explicitly with the problem of SVM regression. I suspect RapidMiner uses this heuristic for classification as well. I am just looking for the source of this information, so I could properly quote it in a scientific publication.
Cheers,
Dann0 -
-
Hello Haddock,
Thanks for replying, but what exactly do you mean by posting a link to the guide by Hsu, Chang and Lin? I could have overlooked it, but I didn't find any reference about the heuristic used by RapidMiner on it. Would you care to explicitly quote where they mention the heuristic described by Ingo Mierswa on this thread?
Regards,
Dann.0 -
According to the RapidMIner source distribution libSVM is ...
And the help for the operator points to them, so I figured that their guide to setting parameters for their library might be relevant.Copyright (c) 2000-2005 Chih-Chung Chang and Chih-Jen Lin
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:
1. Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in the
documentation and/or other materials provided with the distribution.
3. Neither name of copyright holders nor the names of its contributors
may be used to endorse or promote products derived from this software
without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR
CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
0 -
Hi Haddock,
I am aware that the LibSVM library is written by Chang and Lin. But actually, the heuristic parameter seems to be set by RapidMiner, not by LibSVM. At least this was the impression I had by reading Ingo's answer. Sorry for not making this clear. Thanks for the reply, by the way.
Regards,
Dann0 -
Hi,
Reading that thread myself I found it unclear, but figured that whetever RapidMiner does it must depend on the properties of the underlying functions. If you search this forum you'll see that even the meaning of 'C' in SVMs gets vague, so zero 'C' ....
That's why I like Association Rules for booleans!0 -
Hi Haddock,
Thanks for the observation. In either case, I am starting to thing that this heuristic arises from the Wahba, Lin, and Zhang bounds on the leave-one-out cross-validation error. A paper by Olivier Chappele [1] states that the Wahba, Lin, and Zhang bound is such that T = sum( a_i k'(x_i, x_i ) ), in which k' is assumed to be an augmented kernel function with the parameter C embedded within it so that k'(x_i, x_i) = k(x_i, x_i) + 1/c. All sums are from 1 to n, in which n is the number of samples in the data set.
Now, if we go on and substitute k' with the aforementioned relation, I guess we can arrive on
T = sum( a_i k(x_i, x_i) ) + sum(a_i 1/c).
Since we would like to find the minimum of T, I suppose a straightforward strategy would be to assume that T is dependent on the Lagrange multipliers a_i, derive in respect to a and impose stationarity. This would lead to:
del T / del a = sum( k(x_i, x_i) ) + sum (1/c) = sum( k(x_i, x_i) ) + n/c = 0
And with a little algebra this can result in expression for C very similar to the heuristic suggestion as C = n / sum(K(x_i, x_i), albeit with the sign for C inverted. Something else may still be missing.
Cheers,
Dann
[1] http://research.microsoft.com/en-us/um/people/manik/projects/trade-off/papers/ChapelleML02.pdf0