LibSVM - Probability Estimation - Meaning of the Parameters

mbuko New Altair Community Member
edited November 2024 in Community Q&A
Hi,

Within RapidMiner, LibSVM outputs probability estimates based on Platt scaling (https://rapid-i.com/wiki/index.php?title=Support_Vector_Machine_(LibSVM)). I don't understand the meaning of the documented parameters or of the resulting confidence values:

1) Does "calculate confidences" = TRUE mean that the confidence values are the probability estimates?
2) How is the confidence value defined if "calculate confidences" = FALSE?
3) In the multiclass case, how do "confidence for multiclass" = TRUE and "confidence for multiclass" = FALSE behave (also in combination with "calculate confidences" = TRUE or FALSE)?

To use RapidMiner and its output, I need a clear understanding of the parameters, the confidence values, and how they depend on the RapidMiner operators.

I would be grateful if you could help me with these questions!

Best regards,
Mark

Answers

  • MartinLiebig
    Altair Employee
    Hi,

    If you classify with an SVM, you can either compute just the sign of the scalar product, which gives you the prediction, or compute the full scalar product and also get the margin (?). This value can be interpreted as a confidence, but it is not necessarily a probability for the class.
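
The distinction between the sign and the full value of the scalar product can be sketched as follows. This is only an illustration for a linear SVM with made-up weights (`w`, `b`, and the class name `DecisionValueSketch` are hypothetical); it is not RapidMiner's or LIBSVM's code.

```java
// Minimal sketch: decision value vs. predicted label for a linear SVM.
// All numbers are hypothetical and chosen only for illustration.
final class DecisionValueSketch {

    /** Decision value f(x) = w . x + b. */
    static double decisionValue(double[] w, double b, double[] x) {
        double sum = b;
        for (int i = 0; i < w.length; i++) {
            sum += w[i] * x[i];
        }
        return sum;
    }

    /** Predicted label: +1 if f(x) >= 0, otherwise -1. */
    static int predict(double f) {
        return f >= 0 ? 1 : -1;
    }

    public static void main(String[] args) {
        double[] w = {2.0, -1.0}; // hypothetical weight vector
        double b = -0.5;          // hypothetical bias

        double[] nearBoundary = {0.5, 0.25};
        double[] farFromBoundary = {3.0, 0.5};

        double fNear = decisionValue(w, b, nearBoundary);
        double fFar = decisionValue(w, b, farFromBoundary);

        // Both points get the same label, but very different decision values.
        System.out.println("near: f = " + fNear + ", label = " + predict(fNear));
        System.out.println("far:  f = " + fFar + ", label = " + predict(fFar));
    }
}
```

Both points receive the same label, yet the decision values differ widely. The value is unbounded and uncalibrated, which is why it can serve as a confidence but not directly as a probability.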

    For 3), I think there is a one-vs-all scheme in the background and the confidences are then "simply" normalized? I'm not sure exactly, though.

    ~Martin
  • mbuko
    mbuko New Altair Community Member
    @Martin thank you for your reply!

    The RapidMiner documentation says that LibSVM provides probability estimates based on Platt scaling, but which parameter enables this? It is neither clearly described nor defined, which makes it really inadequate for scientific work.



  • Marco_Boeck
    Marco_Boeck New Altair Community Member
    Hi,

    On an unrelated note: how did you find that hugely outdated link?
    I suggest using docs.rapidminer.com in the future.

    Cheers,
    Marco
  • MartinLiebig
    Altair Employee
    Hi,

    I think it is activated by default.

    Have a look at LibSVMModel.java, line 252, which calculates it. I am still not sure what the if clause

    if (model.probA != null && model.probB != null) {
    (line 220)
    means, but it is what triggers the activation.
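
For context on that check: in stock LIBSVM, `probA` and `probB` hold the sigmoid parameters that `sigmoid_train` fits when a model is trained with probability estimates enabled, and they are left as null otherwise. Whether the copy bundled with RapidMiner behaves identically is an assumption here. The sketch below shows the Platt-style mapping from a decision value to a probability, modeled on LIBSVM's `sigmoid_predict`; the class name and the values of `a` and `b` are hypothetical.

```java
// Minimal sketch of Platt scaling: P(y = +1 | x) = 1 / (1 + exp(A * f + B)).
// Modeled on LIBSVM's sigmoid_predict; A and B below are made-up values,
// not parameters fitted on real data.
final class PlattSigmoidSketch {

    /**
     * Maps a decision value to a probability for the positive class.
     * The two branches are algebraically identical and only avoid
     * overflow of exp() for large |A * f + B|.
     */
    static double sigmoidPredict(double decisionValue, double a, double b) {
        double fApB = decisionValue * a + b;
        if (fApB >= 0) {
            return Math.exp(-fApB) / (1.0 + Math.exp(-fApB));
        }
        return 1.0 / (1.0 + Math.exp(fApB));
    }

    public static void main(String[] args) {
        double a = -2.0; // hypothetical fitted slope (negative for a useful classifier)
        double b = 0.0;  // hypothetical fitted offset
        for (double f : new double[] {-1.0, 0.0, 1.0}) {
            System.out.println("f = " + f + " -> P(+1) = " + sigmoidPredict(f, a, b));
        }
    }
}
```

Under this reading, the null check simply asks whether sigmoid parameters were fitted at training time; if not, there is nothing to calibrate the decision values with.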

    ~Martin
  • mbuko
    mbuko New Altair Community Member
    @Martin Thanks!

    I think I have to look into the implementation: LibSVMModel.java and libsvm/Svm.java seem to contain all the relevant information. At a first brief look, I found some interesting parts:

    Svm.java:

    - svm_train
    - svm_svr_probability
    - svm_cross_validation
    - svm_predict_probability
    - sigmoid_predict
    - multiclass_probability // Method 2 from the multiclass_prob paper by Wu, Lin, and Weng
    - svm_predict
    - svm_predict_values
    - svm_binary_svc_probability
    - sigmoid_train (Platt's binary SVM probabilistic output: an improvement from Lin et al.)

    LibSVMModel.java:

    - performPrediction
    - if model.param.svm_type == LibSVMLearner.SVM_TYPE_ONE_CLASS
    - double maxConfidence = Double.NEGATIVE_INFINITY; double minConfidence = Double.POSITIVE_INFINITY;
    - Svm.svm_predict_values
    - else performing regular classification or regression
    - if (model.probA != null && model.probB != null) {
    - Svm.svm_predict_values
    - Svm.sigmoid_predict
    - Svm.multiclass_probability(nr_class, pairwise_prob, classProbs);

    for (k = 0; k < nr_class; k++) {
        example.setValue(confidenceAttributes[k], classProbs[k]);
    }
    - else
    - predictedClass = Svm.svm_predict
    - Svm.svm_predict_values
    - prediction = functionValues[0];
    - example.setValue(confidenceAttributes[0], 1.0d / (1.0d + java.lang.Math.exp(-prediction)));
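
The two branches of the outline above can be compared in a small sketch for the binary case. It assumes that `probA`/`probB` are null when no sigmoid parameters were fitted, and it treats the confidence as belonging to the positive class; which class index 0 actually refers to is not stated in the outline. The class and method names are hypothetical.

```java
// Minimal sketch of the two confidence branches (binary case only).
// Not the RapidMiner implementation; names and values are illustrative.
final class ConfidenceBranchSketch {

    /**
     * Returns a confidence in (0, 1) for one class.
     * With fitted parameters: Platt-scaled sigmoid 1 / (1 + exp(A * f + B)).
     * Without them: fixed logistic function 1 / (1 + exp(-f)).
     */
    static double confidence(double decisionValue, double[] probA, double[] probB) {
        if (probA != null && probB != null) {
            // Calibrated branch: slope and offset were fitted during training.
            return 1.0 / (1.0 + Math.exp(probA[0] * decisionValue + probB[0]));
        }
        // Fallback branch: squashes the raw decision value, no calibration.
        return 1.0 / (1.0 + Math.exp(-decisionValue));
    }

    public static void main(String[] args) {
        double f = 0.7; // hypothetical decision value
        double[] probA = {-3.0}; // hypothetical fitted slope
        double[] probB = {0.2};  // hypothetical fitted offset

        System.out.println("calibrated: " + confidence(f, probA, probB));
        System.out.println("fallback:   " + confidence(f, null, null));
    }
}
```

The fallback is the same formula with A = -1 and B = 0 fixed in advance, so it always lies between 0 and 1 but is not fitted to the data. Only the first branch corresponds to a Platt-scaled probability estimate.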

    I have to analyze it in more detail. I would be very grateful for any support.