RapidMiner/Netbeans Integration problem!

tonio6
tonio6 New Altair Community Member
edited November 5 in Community Q&A
Hello guys,
I have a serious issue and I need your help. I've created a class in my program that builds a RapidMiner model by adding each RapidMiner operator directly in NetBeans, rather than loading a complete model from the RapidMiner program. But now I get an error that looks like this:

.....
....
Rmtest 0
Dec 08, 2013 10:25:00 PM youcomment.gui btRMtestActionPerformed
SEVERE: null
com.rapidminer.operator.OperatorCreationException: No operator description object given for 'com.rapidminer.operator.text.io.FileDocumentInputOperator'
at com.rapidminer.tools.OperatorService.createOperator(OperatorService.java:687)
at youcomment.RapidM.runthisthingy(RapidM.java:96)
at youcomment.gui.btRMtestActionPerformed(gui.java:217)
......
.....
.....

So the error appears right after "Rmtest 0" is printed, which I suppose points to this line of code:
" FileDocumentInputOperator processdocumentfromfiles1 = OperatorService.createOperator(FileDocumentInputOperator.class)"

I've added the whole class I wrote below. Please help me out here; I am not an expert Java programmer, but I am trying to learn! Thanks a lot.


package youcomment;

import com.rapidminer.Process;
import com.rapidminer.RapidMiner;
import com.rapidminer.RapidMiner.ExecutionMode;
import com.rapidminer.example.Attribute;
import com.rapidminer.example.Example;
import com.rapidminer.example.ExampleSet;
import com.rapidminer.example.table.ExampleTable;
import com.rapidminer.operator.ExecutionUnit;
import com.rapidminer.operator.IOContainer;
import com.rapidminer.operator.ModelApplier;
import com.rapidminer.operator.Operator;
import com.rapidminer.operator.OperatorCreationException;
import com.rapidminer.operator.OperatorException;
import com.rapidminer.operator.performance.PerformanceEvaluator;
import com.rapidminer.operator.validation.XValidation;
import com.rapidminer.operator.text.io.*;
import com.rapidminer.operator.learner.bayes.NaiveBayes;
import com.rapidminer.operator.performance.BinominalClassificationPerformanceEvaluator;
import com.rapidminer.tools.OperatorService;
import com.rapidminer.operator.text.io.DocumentLoader;
import com.rapidminer.operator.text.io.DocumentTextInputOperator;
import com.rapidminer.operator.text.io.tokenizer.StringTokenizerOperator;
import com.rapidminer.operator.text.io.transformer.CaseTransformationOperator;
import java.io.*;
import java.util.List;
import java.util.LinkedList;

public class RapidM {

    /**
    * Connect the output-port
    * <code>fromPortName</code> from Operator
    * <code>from</code> with the input-port
    * <code>toPortName</code> of Operator
    * <code>to</code>.
    */
    private static void connect(Operator from, String fromPortName, Operator to, String toPortName) {
        from.getOutputPorts().getPortByName(fromPortName).connectTo(to.getInputPorts().getPortByName(toPortName));
    }

    /**
    * Connect the output-port
    * <code>fromPortName</code> from Subprocess
    * <code>from</code> with the input-port
    * <code>toPortName</code> of Operator
    * <code>to</code>.
    */
    private static void connect(ExecutionUnit from, String fromPortName,
            Operator to, String toPortName) {
        from.getInnerSources().getPortByName(fromPortName).connectTo(
                to.getInputPorts().getPortByName(toPortName));
    }

    /**
    * Connect the output-port
    * <code>fromPortName</code> from Operator
    * <code>from</code> with the input-port
    * <code>toPortName</code> of Subprocess
    * <code>to</code>.
    */
    private static void connect(Operator from, String fromPortName,
            ExecutionUnit to, String toPortName) throws OperatorCreationException, IOException, OperatorException {

        from.getOutputPorts().getPortByName(fromPortName).connectTo(
                to.getInnerSinks().getPortByName(toPortName));
    }

    public void runthisthingy() throws OperatorCreationException, IOException, OperatorException {

        // initialize RapidMiner
        // set properties to point to plugin directory
        String pluginDirString = new File("C:\\Program_Files\\Rapid-I\\RapidMiner5\\lib\\plugins").getAbsolutePath();
        System.setProperty(RapidMiner.PROPERTY_RAPIDMINER_INIT_PLUGINS_LOCATION, pluginDirString);
        RapidMiner.setExecutionMode(ExecutionMode.EMBEDDED_WITHOUT_UI);
        RapidMiner.init();

        // Create a process
        Process process = new Process();

        // Set the parameters of process
        process.getRootOperator().setParameter("parallelize_main_process", "true");
        process.getRootOperator().setParameter("encoding", "UTF-8");

        // all operators from "left to right"
        System.out.println("Rmtest 0");
        // Process Documents from Files (training data)
        FileDocumentInputOperator processdocumentfromfiles1 = OperatorService.createOperator(FileDocumentInputOperator.class);
        System.out.println("Rmtest 1");
        // class name / directory pairs for the "text_directories" list parameter
        List<String[]> textList1 = new LinkedList<String[]>();
        textList1.add(new String[]{"Positive", "positive"});
        textList1.add(new String[]{"Negative", "negative"});
        System.out.println("Rmtest 2");

        // Set the parameters of Process documents from files 1
        processdocumentfromfiles1.setListParameter("text_directories", textList1);
        processdocumentfromfiles1.setParameter("encoding", "UTF-8");
        processdocumentfromfiles1.setParameter("vector_creation", "TF-IDF");
        processdocumentfromfiles1.setParameter("prune_method", "absolute");
        processdocumentfromfiles1.setParameter("prune_below_absolute", "2");
        processdocumentfromfiles1.setParameter("prune_above_absolute", "999");
        processdocumentfromfiles1.setParameter("parallelize_vector_creation","true");
        //X-validation
        XValidation xvalidation = OperatorService.createOperator(XValidation.class);

        // Set the parameters of X-validation
        xvalidation.setParameter(XValidation.PARAMETER_NUMBER_OF_VALIDATIONS, Integer.valueOf(10).toString());
        xvalidation.setParameter("parallelize_training", "true");
        xvalidation.setParameter("parallelize_testing", "true");

        //Operators inside the validation
        NaiveBayes naivebayes = OperatorService.createOperator(NaiveBayes.class);
        //naivebayes.setParameter("training_cycles", "500");
        //RandomForestLearner ManolisRF=OperatorService.createOperator(RandomForestLearner.class);


        Operator modelApplier = OperatorService.createOperator(ModelApplier.class);
        BinominalClassificationPerformanceEvaluator performance = OperatorService.createOperator(BinominalClassificationPerformanceEvaluator.class);

        //Process documents from files
        FileDocumentInputOperator processdocumentfromfiles2 = OperatorService.createOperator(FileDocumentInputOperator.class);

        //set textlist of Process documents from files
        List<String[]> textList2 = new LinkedList<String[]>();
        textList2.add(new String[]{"Unlabeled", "unlabeled"});
       

        // Set the parameters of Process documents from files 2
        processdocumentfromfiles2.setListParameter("text_directories", textList2);
        processdocumentfromfiles2.setParameter("encoding", "UTF-8");
        processdocumentfromfiles2.setParameter("vector_creation", "TF-IDF");
        processdocumentfromfiles2.setParameter("prune_method", "absolute");
        processdocumentfromfiles2.setParameter("prune_below_absolute", "2");
        processdocumentfromfiles2.setParameter("prune_above_absolute", "999");
        processdocumentfromfiles2.setParameter("parallelize_vector_creation","true");
       
        // add operators to the main process and connect them
        process.getRootOperator().getSubprocess(0).addOperator(processdocumentfromfiles1);
        process.getRootOperator().getSubprocess(0).addOperator(xvalidation);
        connect(processdocumentfromfiles1, "example set", xvalidation, "training");

        //operators  inside the process documents from files 1
        StringTokenizerOperator tokenize1 = OperatorService.createOperator(StringTokenizerOperator.class);
        CaseTransformationOperator tranformcases1 = OperatorService.createOperator(CaseTransformationOperator.class);
        // add operators to the process documents from files and connect them
        processdocumentfromfiles1.getSubprocess(0).addOperator(tokenize1);
        processdocumentfromfiles1.getSubprocess(0).addOperator(tranformcases1);

        connect(processdocumentfromfiles1.getSubprocess(0), "document", tokenize1, "document");
        connect(tokenize1, "document", tranformcases1, "document");
        connect(tranformcases1, "document", processdocumentfromfiles1.getSubprocess(0), "document 1");

        // xvalidation
        // training part of xvalidation
        xvalidation.getSubprocess(0).addOperator(naivebayes);
        //xvalidation.getSubprocess(0).addOperator(ManolisRF);

        // create connection within training process: from left to right ...
        connect(xvalidation.getSubprocess(0), "training", naivebayes, "training set");

        //connect(xvalidation.getSubprocess(0), "training", ManolisRF,"training set");
        connect(naivebayes, "model", xvalidation.getSubprocess(0), "model");
        //connect(ManolisRF, "model", modelWriter, "input");

        // testing part of xvalidation
        xvalidation.getSubprocess(1).addOperator(modelApplier);
        xvalidation.getSubprocess(1).addOperator(performance);

        // create connection within testing process: from left to right ...
        connect(xvalidation.getSubprocess(1), "model", modelApplier, "model");
        connect(xvalidation.getSubprocess(1), "test set", modelApplier, "unlabelled data");
        connect(modelApplier, "labelled data", performance, "labelled data");
        connect(performance, "performance", xvalidation.getSubprocess(1), "averagable 1");

        // add operators to the main process and connect them
        process.getRootOperator().getSubprocess(0).addOperator(processdocumentfromfiles2);
        connect(processdocumentfromfiles1, "word list", processdocumentfromfiles2, "word list");

        //operators  inside the process documents from files
        StringTokenizerOperator tokenize2 = OperatorService.createOperator(StringTokenizerOperator.class);
        CaseTransformationOperator tranformcases2 = OperatorService.createOperator(CaseTransformationOperator.class);
        // add operators to the process documents from files and connect them
        processdocumentfromfiles2.getSubprocess(0).addOperator(tokenize2);
        processdocumentfromfiles2.getSubprocess(0).addOperator(tranformcases2);

        connect(processdocumentfromfiles2.getSubprocess(0), "document", tokenize2, "document");
        connect(tokenize2, "document", tranformcases2, "document");
        connect(tranformcases2, "document", processdocumentfromfiles2.getSubprocess(0), "document 1");

        // add operators to the main process and connect them
       
        process.getRootOperator().getSubprocess(0).addOperator(modelApplier);
       
        connect(processdocumentfromfiles2, "example set", modelApplier, "unlabelled data");
        connect(xvalidation, "model", modelApplier, "model");
        connect(modelApplier, "labelled data", process.getRootOperator().getSubprocess(0), "result 1");
        connect(modelApplier, "model", process.getRootOperator().getSubprocess(0), "result 2");

        // print process setup
        //System.out.println(process.getRootOperator().createProcessTree(0));

        File x = new File("Final.rmp");

        process.save(x);

        // perform process
        //process.run();
        IOContainer ioResult = process.run();
       
    }
}

Answers

  • Marco_Boeck
    Marco_Boeck New Altair Community Member
    Hi,

    please have a look here: http://rapid-i.com/rapidforum/index.php/topic,5807
    Question 5 is probably the most interesting for you. We highly discourage creating your own processes programmatically, as it is very error-prone and quite frankly tedious.

    Regards,
    Marco
  • shine
    shine New Altair Community Member
    Hi buddy,

    I guess the error is probably caused by this line:

    "String pluginDirString = new File("C:\\Program_Files\\Rapid-I\\RapidMiner5\\lib\\plugins").getAbsolutePath();"

    Since you specify your pluginDir, you should make sure that "C:\\Program_Files\\Rapid-I\\RapidMiner5\\lib\\plugins" contains rapidminer-Text-Processing.X.XXX.jar in the right version for your project.

    Otherwise, even if you add the text-processing jar to the \lib folder of your NetBeans project to "solve" the error indicated by NetBeans, there is no guarantee that the rapidminer-Text-Processing.X.XXX.jar in your specified pluginDir (the one actually searched for classes) works.

    I have tested your code in my IDE and it didn't show this error. However, there are other errors you should fix afterwards:
    1. You should create two modelApplier operators, one for the subprocess in X-Validation and another one for the main process.
    2. Create a "Set Role" operator to set the label attribute.

    Hope that helps.

    Best,
    Shine
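A quick way to check the plugin-directory point raised in the reply above, before calling RapidMiner.init(), is to list the directory and look for the Text Processing jar. The sketch below uses only the Java standard library. The class name PluginDirCheck, its two helper methods, and the search fragment "text" are illustrative assumptions and not part of RapidMiner; the jar's real file name may differ between versions. Note also that the folder name Program_Files in the posted path is unusual for Windows (installations normally live under "Program Files"), so it is worth confirming that the directory exists at all.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of the RapidMiner API.
final class PluginDirCheck {

    /** Keeps only the .jar file names that contain nameFragment (case-insensitive). */
    static List<String> matchingJars(List<String> fileNames, String nameFragment) {
        String wanted = nameFragment.toLowerCase(Locale.ROOT);
        List<String> matches = new ArrayList<>();
        for (String name : fileNames) {
            String lower = name.toLowerCase(Locale.ROOT);
            if (lower.endsWith(".jar") && lower.contains(wanted)) {
                matches.add(name);
            }
        }
        return matches;
    }

    /** Lists the file names in dir; empty if dir is missing or not a directory. */
    static List<String> listNames(File dir) {
        String[] names = dir.list(); // null when dir does not exist or is not a directory
        return names == null ? Collections.<String>emptyList() : Arrays.asList(names);
    }

    public static void main(String[] args) {
        // Path copied from the question; adjust it to your installation.
        File pluginDir = new File("C:\\Program_Files\\Rapid-I\\RapidMiner5\\lib\\plugins");
        if (!pluginDir.isDirectory()) {
            System.out.println("Plugin directory not found: " + pluginDir);
            return;
        }
        // "text" is an assumed fragment of the Text Processing jar's file name.
        List<String> jars = matchingJars(listNames(pluginDir), "text");
        System.out.println(jars.isEmpty()
                ? "No matching plugin jar in " + pluginDir
                : "Found plugin jar(s): " + jars);
    }
}
```

If the directory is missing or holds no matching jar, that would be consistent with the "No operator description object given" error, since the operator classes would compile against the project's \lib copy but the extension would never be registered at runtime.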
  • tonio6
    tonio6 New Altair Community Member
    Thank you all for your help. It turns out there were many errors in the previous application; I kept fixing things, but more would come up, and I couldn't even get the process to run. Finally I decided to follow the first piece of advice and load a ready-made model. It seems to run fine, except that I get an error in this part of the code:

    IOContainer ioResult = process.run();
    ExampleSet resultSet = (ExampleSet) ioResult.getElementAt(0);
    ExampleTable mytable = resultSet.getExampleTable();

    The error is this one...

    com.rapidminer.Process run
    INFO: Process C:\Users\Antonis\.RapidMiner5\repositories\Local Repository\yo.rmp finished successfully after 0 s
    Exception in thread "AWT-EventQueue-0" java.lang.ClassCastException: com.rapidminer.operator.learner.bayes.SimpleDistributionModel cannot be cast to com.rapidminer.example.ExampleSet

    I don't know how to fix this. Please help me again. I am trying to make it work and I feel that I am so close! Thank you in advance for your time! :D
  • aborg
    aborg New Altair Community Member
    You have a SimpleDistributionModel at result port 1, although you were expecting an ExampleSet. It seems you connected the wrong port there; otherwise you need to handle the SimpleDistributionModel case and cast to that.
    Cheers, gabor
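The ClassCastException discussed above comes from casting a result element by position without checking what it actually is. As a minimal sketch of the defensive alternative, the helper below searches a list of results for the first element of the wanted type instead of assuming it sits at index 0. It uses a plain List<Object> as a stand-in for the elements you would read one by one with ioResult.getElementAt(i); the class ResultPicker and the method firstOfType are hypothetical names, and the stand-in values (a StringBuilder playing the model, a String playing the example set) are for illustration only.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical helper, not part of the RapidMiner API.
final class ResultPicker {

    /** Returns the first element that is an instance of type, if there is one. */
    static <T> Optional<T> firstOfType(List<?> results, Class<T> type) {
        for (Object element : results) {
            if (type.isInstance(element)) {
                return Optional.of(type.cast(element));
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Stand-in results: a "model" at index 0, the wanted "table" at index 1.
        List<Object> results = Arrays.asList(new StringBuilder("model"), "table", 42);

        // Casting results.get(0) to String would throw ClassCastException here,
        // just like casting the model to ExampleSet did in the thread.
        Optional<String> table = firstOfType(results, String.class);
        System.out.println(table.orElse("no table in results")); // prints "table"
    }
}
```

The same instanceof-before-cast idea applies to the real result container: check each element's type before casting, and report a clear message when no element of the expected type is present.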
  • tonio6
    tonio6 New Altair Community Member
    I've tried several connections, but I still get the same error. Can you propose a specific connection for me to try? I think it's something very simple that I am missing. Here is my RapidMiner XML:

    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.3.015">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="5.3.015" expanded="true" name="Process">
        <process expanded="true">
          <operator activated="true" class="text:process_document_from_file" compatibility="5.3.002" expanded="true" height="76" name="Process Documents from Files" width="90" x="45" y="75">
            <list key="text_directories">
              <parameter key="Positive" value="C:\Users\Antonis\Desktop\netbeans\tue-bis-overig-ir\youComment\positive"/>
              <parameter key="Negative" value="C:\Users\Antonis\Desktop\netbeans\tue-bis-overig-ir\youComment\negative"/>
            </list>
            <process expanded="true">
              <operator activated="true" class="text:tokenize" compatibility="5.3.002" expanded="true" height="60" name="Tokenize" width="90" x="124" y="126"/>
              <operator activated="true" class="text:filter_stopwords_english" compatibility="5.3.002" expanded="true" height="60" name="Filter Stopwords (English)" width="90" x="312" y="124"/>
              <operator activated="true" class="text:transform_cases" compatibility="5.3.002" expanded="true" height="60" name="Transform Cases" width="90" x="581" y="120"/>
              <connect from_port="document" to_op="Tokenize" to_port="document"/>
              <connect from_op="Tokenize" from_port="document" to_op="Filter Stopwords (English)" to_port="document"/>
              <connect from_op="Filter Stopwords (English)" from_port="document" to_op="Transform Cases" to_port="document"/>
              <connect from_op="Transform Cases" from_port="document" to_port="document 1"/>
              <portSpacing port="source_document" spacing="0"/>
              <portSpacing port="sink_document 1" spacing="0"/>
              <portSpacing port="sink_document 2" spacing="0"/>
            </process>
          </operator>
          <operator activated="true" class="text:process_document_from_file" compatibility="5.3.002" expanded="true" height="76" name="Process Documents from Files (2)" width="90" x="179" y="165">
            <list key="text_directories">
              <parameter key="Unlabeled" value="C:\Users\Antonis\Desktop\netbeans\tue-bis-overig-ir\youComment\unlabeled"/>
            </list>
            <process expanded="true">
              <operator activated="true" class="text:tokenize" compatibility="5.3.002" expanded="true" name="Tokenize (2)"/>
              <operator activated="true" class="text:filter_stopwords_english" compatibility="5.3.002" expanded="true" name="Filter Stopwords (2)"/>
              <operator activated="true" class="text:transform_cases" compatibility="5.3.002" expanded="true" name="Transform Cases (2)"/>
              <connect from_port="document" to_op="Tokenize (2)" to_port="document"/>
              <connect from_op="Tokenize (2)" from_port="document" to_op="Filter Stopwords (2)" to_port="document"/>
              <connect from_op="Filter Stopwords (2)" from_port="document" to_op="Transform Cases (2)" to_port="document"/>
              <connect from_op="Transform Cases (2)" from_port="document" to_port="document 1"/>
              <portSpacing port="source_document" spacing="0"/>
              <portSpacing port="sink_document 1" spacing="0"/>
              <portSpacing port="sink_document 2" spacing="0"/>
            </process>
          </operator>
          <operator activated="true" class="set_role" compatibility="5.3.015" expanded="true" height="76" name="Set Role" width="90" x="380" y="210">
            <parameter key="attribute_name" value="label"/>
            <list key="set_additional_roles"/>
          </operator>
          <operator activated="true" class="x_validation" compatibility="5.1.002" expanded="true" height="112" name="Validation" width="90" x="246" y="30">
            <description>A cross-validation evaluating a decision tree model.</description>
            <process expanded="true">
              <operator activated="true" class="naive_bayes" compatibility="5.3.015" expanded="true" height="76" name="Naive Bayes" width="90" x="45" y="30"/>
              <connect from_port="training" to_op="Naive Bayes" to_port="training set"/>
              <connect from_op="Naive Bayes" from_port="model" to_port="model"/>
              <connect from_op="Naive Bayes" from_port="exampleSet" to_port="through 1"/>
              <portSpacing port="source_training" spacing="0"/>
              <portSpacing port="sink_model" spacing="0"/>
              <portSpacing port="sink_through 1" spacing="0"/>
              <portSpacing port="sink_through 2" spacing="0"/>
            </process>
            <process expanded="true">
              <operator activated="true" class="apply_model" compatibility="5.3.015" expanded="true" height="76" name="Apply Model" width="90" x="45" y="30">
                <list key="application_parameters"/>
              </operator>
              <operator activated="true" class="performance" compatibility="5.3.015" expanded="true" height="76" name="Performance" width="90" x="179" y="30"/>
              <connect from_port="model" to_op="Apply Model" to_port="model"/>
              <connect from_port="through 1" to_op="Apply Model" to_port="unlabelled data"/>
              <connect from_op="Apply Model" from_port="labelled data" to_op="Performance" to_port="labelled data"/>
              <connect from_op="Performance" from_port="performance" to_port="averagable 1"/>
              <portSpacing port="source_model" spacing="0"/>
              <portSpacing port="source_test set" spacing="0"/>
              <portSpacing port="source_through 1" spacing="0"/>
              <portSpacing port="source_through 2" spacing="0"/>
              <portSpacing port="sink_averagable 1" spacing="0"/>
              <portSpacing port="sink_averagable 2" spacing="0"/>
            </process>
          </operator>
          <operator activated="true" class="apply_model" compatibility="5.3.015" expanded="true" height="76" name="Apply Model (2)" width="90" x="514" y="120">
            <list key="application_parameters"/>
          </operator>
          <connect from_port="input 1" to_op="Process Documents from Files" to_port="word list"/>
          <connect from_op="Process Documents from Files" from_port="example set" to_op="Validation" to_port="training"/>
          <connect from_op="Process Documents from Files" from_port="word list" to_op="Process Documents from Files (2)" to_port="word list"/>
          <connect from_op="Process Documents from Files (2)" from_port="example set" to_op="Set Role" to_port="example set input"/>
          <connect from_op="Process Documents from Files (2)" from_port="word list" to_port="result 4"/>
          <connect from_op="Set Role" from_port="example set output" to_op="Apply Model (2)" to_port="unlabelled data"/>
          <connect from_op="Validation" from_port="model" to_op="Apply Model (2)" to_port="model"/>
          <connect from_op="Validation" from_port="averagable 1" to_port="result 2"/>
          <connect from_op="Apply Model (2)" from_port="labelled data" to_port="result 3"/>
          <connect from_op="Apply Model (2)" from_port="model" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="source_input 2" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
          <portSpacing port="sink_result 3" spacing="0"/>
          <portSpacing port="sink_result 4" spacing="0"/>
          <portSpacing port="sink_result 5" spacing="0"/>
        </process>
      </operator>
    </process>


    thank you again!
  • aborg
    aborg New Altair Community Member
    Hello,

      It is possible that you have connected the right ports to the right outputs, but in that case you should cast to SimpleDistributionModel and retrieve the parameters using, for example, the getRawDistributionParameter() method.
      If you are interested in the labelled data, you should rewire the process, or select the third element (index 2 in the array) from the result, which is probably an ExampleSet.
    Hope this helps, gabor
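To see which operator output feeds each result port without opening the RapidMiner GUI, the connect elements of the process XML can be read directly. The following is a sketch using only the JDK's DOM parser; the class ResultPortLister and the method resultPorts are hypothetical names. It assumes, based on the XML posted above, that a connection into a process sink has a to_port but no to_op attribute, and that top-level result ports are named "result N". The sample string contains a few of the connect lines from the posted process.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

// Hypothetical helper, not part of the RapidMiner API.
final class ResultPortLister {

    /** Maps each "result N" port to "operator.port" of the output connected to it. */
    static Map<String, String> resultPorts(String processXml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(processXml.getBytes(StandardCharsets.UTF_8)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse process XML", e);
        }
        Map<String, String> wiring = new TreeMap<>(); // sorted by port name (as text)
        NodeList connects = doc.getElementsByTagName("connect");
        for (int i = 0; i < connects.getLength(); i++) {
            Element c = (Element) connects.item(i);
            String toPort = c.getAttribute("to_port");
            // Assumption: connections into a process sink carry no to_op attribute.
            if (!c.hasAttribute("to_op") && toPort.startsWith("result ")) {
                wiring.put(toPort, c.getAttribute("from_op") + "." + c.getAttribute("from_port"));
            }
        }
        return wiring;
    }

    public static void main(String[] args) {
        String xml = "<process>"
                + "<connect from_op=\"Validation\" from_port=\"averagable 1\" to_port=\"result 2\"/>"
                + "<connect from_op=\"Apply Model (2)\" from_port=\"labelled data\" to_port=\"result 3\"/>"
                + "<connect from_op=\"Apply Model (2)\" from_port=\"model\" to_port=\"result 1\"/>"
                + "<connect from_op=\"Set Role\" from_port=\"example set output\""
                + " to_op=\"Apply Model (2)\" to_port=\"unlabelled data\"/>"
                + "</process>";
        resultPorts(xml).forEach((port, source) -> System.out.println(port + " <- " + source));
    }
}
```

For the sample lines this lists "result 1" as fed by the model output of Apply Model (2) and "result 3" as fed by its labelled data output, which matches the suggestion above to read the third result element when the labelled data is what you want.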
  • tonio6
    tonio6 New Altair Community Member
    What I have now is this:

    ExampleSet resultSet = (ExampleSet) ioResult.getElementAt(0);
    ExampleTable mytable = resultSet.getExampleTable();
    for (int i = 0; i < mytable.size(); i++) {
        Example example = resultSet.getExample(i);
        // we only need the prediction column
        Attribute predict = example.getAttributes().get("prediction(label)");
        String resultString = example.getValueAsString(predict);
        // show the result
        System.out.println("\nPrediction is : " + resultString + "\n");
    }

    But it doesn't work for a SimpleDistributionModel. How should I change this to make it work?
  • Marco_Boeck
    Marco_Boeck New Altair Community Member
    Hi,

    tonio6, please do not post your question several times; it will not get you an answer any faster. See this thread for an answer: http://rapid-i.com/rapidforum/index.php/topic,2775.0.html

    Regards,
    Marco