Need reference for Optimize Parameters (Evolutionary) [SOLVED]

I don't know if there is a specific paper for this operator. It just uses an optimization technique called "Evolution Strategy" (ES), which, as far as I know, was introduced by Rechenberg and Schwefel in the 1970s. They proposed a generalized optimization method, and the operator applies this method to find the best parameters. There is nothing special beyond that, so you should be fine citing a standard ES reference.
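To make the general idea concrete, here is a minimal (1+1)-ES sketch in Java. This is only an illustration of the technique, not RapidMiner's actual operator code; the sphere objective, the step-size constants, and the budget of 1000 generations are all placeholder assumptions.

```java
import java.util.Random;

// Minimal (1+1)-Evolution Strategy sketch: Gaussian mutation plus the
// classic 1/5-success rule for step-size adaptation. The sphere function
// below is a placeholder objective; a parameter optimizer would plug in
// the performance of the process being tuned instead.
public class OnePlusOneES {

    static double fitness(double[] x) {         // placeholder: minimize sum of squares
        double s = 0;
        for (double v : x) s += v * v;
        return s;
    }

    public static void main(String[] args) {
        Random rng = new Random(42);
        int dim = 5;
        double[] parent = new double[dim];      // random initial parameter vector
        for (int i = 0; i < dim; i++) parent[i] = rng.nextDouble() * 10 - 5;
        double sigma = 1.0;                     // mutation step size
        double parentFit = fitness(parent);

        for (int gen = 0; gen < 1000; gen++) {
            double[] child = new double[dim];   // mutate: add Gaussian noise
            for (int i = 0; i < dim; i++)
                child[i] = parent[i] + sigma * rng.nextGaussian();
            double childFit = fitness(child);
            if (childFit < parentFit) {         // keep the child only on improvement
                parent = child;
                parentFit = childFit;
                sigma *= 1.22;                  // 1/5-success rule: widen on success
            } else {
                sigma *= 0.82;                  // ... shrink on failure
            }
        }
        System.out.printf("best fitness: %.6f%n", parentFit);
    }
}
```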
You should probably cite the PhD thesis by Thomas Bäck, since he is from Dortmund University (although he is currently in Leiden).
http://arnetminer.org/person/thomas-back-1509429.html
Alternatively, you can cite his close friend Gusz Eiben:
http://www.cs.vu.nl/~gusz/ecbook/ecbook.html
Or, if the code is CMA-ES rather than plain ES, you can cite Nikolaus Hansen:
https://www.lri.fr/~hansen/cmsa-versus-cma.html
Among these three links you are almost certain to find a good paper to cite.
Hi, this is of course only feasible if you are comfortable with Java (at least with reading code). In that case, the following link has instructions for importing the RapidMiner project into Eclipse: http://rapid-i.com/content/view/25/27/lang,en/
Ali2013 wrote:
Thanks a lot for your help.
I am new to RapidMiner and I do not know how to trace code. Please tell me how I can trace the code during a run, or where to find the Optimize Parameters (Evolutionary) source code, so I can get my answers.
With best regards
To find the code of a certain operator, open OperatorsDoc.xml, search for the operator name, and then search for the respective key in Operators.xml, which will point you to the underlying Java class (a sketch of how to script that lookup follows below).
Best regards,
Marius
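If you'd rather script that lookup, something like the rough sketch below could work. Note the assumptions: it treats Operators.xml as declaring each operator in an <operator> element with <key> and <class> child elements, and the key "optimize_parameters_evolutionary" is hypothetical; both may differ in your RapidMiner version, so adjust the names accordingly.

```java
import java.io.File;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Rough sketch: print the Java class registered for an operator key in
// Operators.xml. ASSUMPTION: operators are declared as <operator> elements
// with <key> and <class> children; check your file and adapt the tag names.
public class OperatorLookup {
    public static void main(String[] args) throws Exception {
        String wantedKey = "optimize_parameters_evolutionary";  // hypothetical key
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new File("Operators.xml"));
        NodeList operators = doc.getElementsByTagName("operator");
        for (int i = 0; i < operators.getLength(); i++) {
            Element op = (Element) operators.item(i);
            NodeList keys = op.getElementsByTagName("key");
            NodeList classes = op.getElementsByTagName("class");
            if (keys.getLength() > 0 && classes.getLength() > 0
                    && wantedKey.equals(keys.item(0).getTextContent().trim())) {
                System.out.println("class: " + classes.item(0).getTextContent().trim());
            }
        }
    }
}
```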
http://rapid-i.com/api/rapidminer-5.1/com/rapidminer/operator/learner/functions/kernel/evosvm/EvoSVM.html
This is an SVM implementation that uses an evolutionary algorithm (ES) to solve the dual optimization problem of an SVM. It turns out that on many datasets this simple implementation is as fast and accurate as the usual SVM implementations. In addition, it is capable of learning with kernels which are not positive semi-definite, and it can also be used for multi-objective learning, which makes selecting C before learning unnecessary.
Mierswa, Ingo. "Evolutionary Learning with Kernels: A Generic Solution for Large Margin Problems." In Proceedings of the 8th Annual Conference on Genetic and Evolutionary Computation (GECCO '06), pp. 1553-1560, ACM, New York, NY, 2006. ISBN 1-59593-186-4, doi:10.1145/1143997.1144249.
http://dl.acm.org/citation.cfm?id=1144249
PDF: http://wing2.ddns.comp.nus.edu.sg/downloads/keyphraseCorpus/89/89.pdf
If you have read this entire paper, can you tell me what all these variables mean?
http://i.snag.gy/MlUz8.jpg

As far as I understand:
a: constrained real values (the variables being optimized)
n: the number of support vectors (the a's)
y: the labels (with values -1 and +1)
k(.,.): a kernel function
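For reference, here is the formula from the screenshot as best I can reconstruct it, assuming it is the standard soft-margin SVM dual that the paper works with:

```latex
\max_{\alpha} \; W(\alpha) = \sum_{i=1}^{n} \alpha_i
    - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n}
      y_i \, y_j \, \alpha_i \, \alpha_j \, k(x_i, x_j)
\quad \text{subject to} \quad
0 \le \alpha_i \le C, \qquad \sum_{i=1}^{n} \alpha_i y_i = 0
```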
So how is n chosen?
Is it maybe the number of data points?
If so, I should be able to understand fully what this formula does.
We have a double loop, so we get all possible combinations of two data points in our data set.
y_i * y_j evaluates to 1 when both data points are of the same class and to -1 when they are of different classes.
a_i * a_j * k(x_i, x_j) also evaluates to a scalar value.
Since we are maximizing, we want a_i * a_j * k(x_i, x_j) to evaluate to some positive value if the points are of the same class, and to some negative value if they are not.
k(x_i, x_j) can perhaps be interpreted as a notion of similarity between the two data points: strongly positive when they are very similar, strongly negative when they are very dissimilar.
It is pretty clear to me that I don't fully understand what is going on here.
Optimizing the a's that maximize this formula using ES seems straightforward to me, but why this formula optimizes the margin is unclear.
Once the paper mentions the "Wolfe dual" I am lost, but I would like to understand!
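To check my reading of the double loop, here is a small Java sketch that just evaluates the dual objective W(a) for given a's; it computes the formula but does not optimize anything, and the RBF kernel and toy data are only assumed examples.

```java
// Sketch: evaluate the SVM dual objective W(a) for given dual variables a,
// labels y in {-1, +1}, and data points x. The RBF kernel is just an
// assumed example; any kernel k(.,.) could be plugged in.
public class DualObjective {

    static double rbfKernel(double[] xi, double[] xj, double gamma) {
        double d2 = 0;
        for (int t = 0; t < xi.length; t++) {
            double diff = xi[t] - xj[t];
            d2 += diff * diff;
        }
        return Math.exp(-gamma * d2);           // similarity in (0, 1]
    }

    // W(a) = sum_i a_i - 1/2 * sum_i sum_j y_i y_j a_i a_j k(x_i, x_j)
    static double dualObjective(double[] a, int[] y, double[][] x, double gamma) {
        int n = a.length;                       // n = number of data points
        double linear = 0, quad = 0;
        for (int i = 0; i < n; i++) {
            linear += a[i];
            for (int j = 0; j < n; j++) {       // the double loop over all pairs
                quad += y[i] * y[j] * a[i] * a[j] * rbfKernel(x[i], x[j], gamma);
            }
        }
        return linear - 0.5 * quad;
    }

    public static void main(String[] args) {
        double[][] x = {{0, 0}, {1, 1}, {4, 4}, {5, 5}};   // toy data
        int[] y = {-1, -1, 1, 1};                          // toy labels
        double[] a = {0.5, 0.5, 0.5, 0.5};                 // some feasible a's
        System.out.println("W(a) = " + dualObjective(a, y, x, 1.0));
    }
}
```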
Hi,
as far as I remember, n is the number of examples in the dataset. You are right in your assumptions about the other variables.
The Elements of Statistical Learning by Hastie, Tibshirani, and Friedman contains a good introduction to and mathematical derivation of the SVM and the formula you cite: http://www-stat.stanford.edu/~tibs/ElemStatLearn/index.html
Best regards,
Marius