"SVM generates same value for output"

I'm trying to cluster data using the SVMCluster model from Samples/Processes/07 but only get 1 cluster from my data.  I scaled the data to percentages and want to use a windowing process to examine some customer history then predict future buying.

I changed the model to LibSVMLearner and used the genetic parameter optimizer to try to find acceptable parameters.  The output (prediction) is the same value for all examples.  When I use your SVMclustering process with LibSVM it does generate different prediction values.

Can you explain what I am doing wrong?  Do I need to use the original data instead of percentages?

Here are three things: a modified RM sample that generates different prediction values (for simplicity I used the same data for training and testing, just to get some results); my process, which generates the same prediction value for every example; and some sample data.  My process also takes a very long time to read the model after training, even though the model file is only 35K.  Is there something wrong with model storage that could cause the problem?

description and two process examples

simple example adapted from 07_EvolutionaryParameterOptimization



    You have two errors in your process setup, plus some less-than-optimal operator choices:

    1. You are using an SVM with an RBF kernel, which means that in addition to C you have to optimize the gamma parameter.
    2. You are writing the model in each iteration of the parameter optimization. That means that in the end the file contains the model of the *last* parameter combination, not of the *best* one.

    Those were the errors. Below are some more hints:
    3. To optimize only one or two parameters in a well-defined range, you should not use the evolutionary optimization but grid optimization. Try values for both C and gamma between 1e-6 and 100 on a logarithmic scale.
    4. Read/Write Model is deprecated. You should not write the model to disk with those operators, but use Store and Retrieve to write/read the model to/from the repository.
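
As a rough illustration of hint 3, a logarithmic grid from 1e-6 to 100 can be sketched in base R. The step of one value per decade and the variable names are assumptions made for this sketch, not settings taken from the attached process.

```r
# Candidate values from 1e-6 to 100, one per decade (an assumed step size).
exponents    <- -6:2
cost_values  <- 10^exponents
gamma_values <- 10^exponents

# Full grid: every (cost, gamma) pair, 9 x 9 combinations.
param_grid <- expand.grid(cost = cost_values, gamma = gamma_values)

print(nrow(param_grid))          # number of combinations to evaluate
print(range(param_grid$gamma))   # smallest and largest gamma in the grid
```

In e1071, vectors like these could be passed as the `gamma` and `cost` arguments of `tune.svm`, in the same way the R script later in this thread passes `2^(4:6)` and `2^(1:2)`.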

    To work around point 2, remove the Loop and Average operators and use more iterations for the X-Validation instead. Connect its model output to the result (res) output of the optimization operator. That makes sure that in the end the best model is returned.
    For reference, have a look at the attached process.
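
The difference between keeping the *last* and the *best* result (point 2) can be sketched in a few lines of R. The error values below are made-up numbers used only for illustration; they do not come from the posted data or processes.

```r
# Hypothetical validation errors for four parameter combinations (made-up numbers).
candidates <- data.frame(
  cost  = c(0.1, 1, 10, 100),
  error = c(0.40, 0.12, 0.25, 0.31)
)

last_kept <- NULL  # overwritten in every iteration, like writing the model file each time
best_kept <- NULL  # replaced only when the error improves

for (i in seq_len(nrow(candidates))) {
  current   <- candidates[i, ]
  last_kept <- current
  if (is.null(best_kept) || current$error < best_kept$error) {
    best_kept <- current
  }
}

print(last_kept$cost)  # cost of the last combination tried
print(best_kept$cost)  # cost of the combination with the lowest error
```

With these numbers the two results differ, which is the situation described above: a file written in every iteration ends up holding whatever was tried last.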

    Best regards,

    Thanks for your suggestions.  I've updated my process based on your sample.  I still get the same prediction value for every example.

    label  rownumber  prediction(label)      attrib1   attrib2  attrib3    attrib4   attrib5   attrib6    attrib7
    0.02   100.0      -0.010000000000000002  -0.00445  0.3857   -0.00159   0.017248  0.017248  -1.531244  -0.073051
    -0.02  101.0      -0.010000000000000002  0.023435  0.8366   0.009283   0.021341  0.026825  -1.970992  -0.096819
    -0.01  102.0      -0.010000000000000002  0.001783  0.9629   -2.64E-4   0.010065  0.010065  -1.028535  0.046995
    -0.03  103.0      -0.010000000000000002  0.001799  0.9545   -0.006372  0.011877  0.011877  -0.482728  -0.041898
    0.0    104.0      -0.010000000000000002  0.017011  0.802    -0.026941  0.057796  0.057796  2.960633   0.168071

    Running this process also produces an error message:
    SEVERE: There is more than one renderable candidate for the result of com.rapidminer.operator.learner.functions.kernel.LibSVMModel

    This error also appears when I run your code.

    What else can I try to fix this? 

    The SVMLearner is executed before the Parameter Optimization. In this process setup, the connections alone do not define a unique process order, so you have to adjust it manually. To do so, enter process ordering mode by clicking the icon with the blue arrow and the question mark, and adjust the process order so that the optimization and the Parameter Setter are executed before the learner.

    Best regards,

    I updated the model with your suggestions but I still have the same output value for each entry.  I've created a simple process that trains an SVM and then uses the same data for forecasting.  It generates the same output value.

    I also created the same basic process in R and it generates output predictions close to the original values.

    I'm sure the solution is something simple, but after searching the forum for possible solutions and using the R parameters, I still get the same value for all predictions in RapidMiner.

    Any other ideas I can test?  Thanks

    ## copy data file rapidsvm.csv to c:\rapid
    library(e1071)  # provides svm() and tune.svm()
    ## read and prepare data
    custdata <- read.table('rapidsvm.csv', header = TRUE, dec = '.', sep = ',', na.strings = c('XXXXXXX'))
    rownames(custdata) <- custdata$rownbr
    ## drop the id columns so they are not used as predictors
    custdata$custid <- NULL
    custdata$rownumber <- NULL
    ## find best svm parameters
    ## e1071.pdf, pages 49-55
    tunedsvm <- tune.svm(label ~ ., data = custdata, type = 'eps-regression', gamma = 2^(4:6), cost = 2^(1:2))
    tund.gamma <- tunedsvm$best.parameters$gamma
    tund.cost <- tunedsvm$best.parameters$cost
    ## use best parameters in model
    svmmodel <- svm(label ~ ., data = custdata, type = 'eps-regression', gamma = tund.gamma, cost = tund.cost)
    ## compare original to svm predicted value (predictions are on the training data)
    pred <- fitted(svmmodel)
    compare <- as.data.frame(custdata[1:20, 'label'])
    colnames(compare) <- 'orig_value'
    compare$svm_pred <- round(pred[1:20], digits = 3)

    Hm, how did you optimize the SVM? Which process do the values C=2 and gamma=63 come from? Definitely not from the process you posted above! Those values are not tested in the Optimize Parameters operator configured there.
    Furthermore, in the above process you use only a 2-fold cross-validation to optimize the parameters. Try increasing the number of folds for better results!
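
To make the effect of the fold count concrete, here is a small base R sketch. The example count of 100 and the helper names `training_share` and `assign_folds` are hypothetical and chosen only for illustration. It shows how much of the data each model sees during k-fold cross-validation and one simple way to assign examples to folds.

```r
# Share of the data each model is trained on in k-fold cross-validation.
training_share <- function(k) (k - 1) / k

# Assign n examples to k folds of (nearly) equal size, in random order.
assign_folds <- function(n, k) sample(rep(seq_len(k), length.out = n))

set.seed(1)                      # only to make the sketch repeatable
folds <- assign_folds(100, 10)   # 100 is a hypothetical example count

print(table(folds))              # ten folds with 10 examples each
print(training_share(c(2, 10)))  # 0.5 and 0.9
```

With 2 folds each model is trained on only half of the data, while with 10 folds it is trained on 90 percent, so the performance estimate used to pick C and gamma is usually based on models closer to the final one.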

    Best regards,