"Questions abour RM DBScan arff in java source"

Islon
Islon New Altair Community Member
edited November 5 in Community Q&A
Hi guys, I'm a Weka user trying to learn how to use RapidMiner's clustering algorithms in a simple Java application.

I want to build a simple process that reads an ARFF file and runs DBScan on it, then shows the results in the NetBeans console.

My code:

package rapidminer;

import com.rapidminer.Process;
import com.rapidminer.RapidMiner;
import com.rapidminer.RapidMiner.ExecutionMode;
import com.rapidminer.operator.Operator;
import com.rapidminer.operator.OperatorCreationException;
import com.rapidminer.operator.OperatorException;
import com.rapidminer.operator.clustering.clusterer.DBScan;
import com.rapidminer.operator.io.ArffExampleSource;
import com.rapidminer.operator.ports.PortException;
import com.rapidminer.tools.OperatorService;


public class Rapidminer
{
    public static Process createProcess()
    {
        try
        {
            // invoke init before using the OperatorService
            RapidMiner.setExecutionMode(ExecutionMode.COMMAND_LINE);
            RapidMiner.init();
        }
        catch (Exception e)
        {
            e.printStackTrace(System.out);
        }

        // create process
        Process process = new Process();
        try
        {
            // create operator
            Operator inputData = OperatorService.createOperator(ArffExampleSource.class);
            // set parameters
            inputData.setParameter("data_file", "C:/iris.arff");

            // Clustering
            Operator scan = OperatorService.createOperator(DBScan.class);
            scan.setParameter("min_points", Double.toString(5));
            scan.setParameter("epsilon", Double.toString(0.1));

            // add operator to process
            process.getRootOperator().getSubprocess(0).addOperator(inputData);
            process.getRootOperator().getSubprocess(0).addOperator(scan);

            inputData.getOutputPorts().getPortByName("output").connectTo(scan.getInputPorts().getPortByName("training set"));
        }
        catch (OperatorCreationException | PortException e)
        {
            e.printStackTrace(System.out);
        }
        return process;
    }

    public static void main(String[] argv)
    {
        // create process
        Process process = createProcess();
        // print process setup
        System.out.println(process.getRootOperator().createProcessTree(0));

        try
        {
            // perform process
            process.run();
        }
        catch (OperatorException e)
        {
            e.printStackTrace(System.out);
        }
    }
}

Here are my questions:

1. How can I print the DBScan output in the NetBeans console?

2. When I executed the code, this error message was shown:

com.rapidminer.operator.UserError: No data was delivered at port Clustering.example set (disconnected).
at com.rapidminer.operator.ports.impl.AbstractPort.getData(AbstractPort.java:99)
at com.rapidminer.operator.clustering.clusterer.AbstractClusterer.doWork(AbstractClusterer.java:94)
at com.rapidminer.operator.Operator.execute(Operator.java:855)
at com.rapidminer.operator.execution.SimpleUnitExecutor.execute(SimpleUnitExecutor.java:51)
at com.rapidminer.operator.ExecutionUnit.execute(ExecutionUnit.java:711)
at com.rapidminer.operator.OperatorChain.doWork(OperatorChain.java:379)
at com.rapidminer.operator.Operator.execute(Operator.java:855)
at com.rapidminer.Process.run(Process.java:949)
at com.rapidminer.Process.run(Process.java:873)
at com.rapidminer.Process.run(Process.java:832)
at com.rapidminer.Process.run(Process.java:827)
at com.rapidminer.Process.run(Process.java:817)
at rapidminer.Rapidminer.main(Rapidminer.java:75)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at com.javafx.main.Main.launchApp(Main.java:658)
at com.javafx.main.Main.main(Main.java:805)

I know these are newbie questions, but these are my doubts.  ;D

Answers

  • Islon
    Islon New Altair Community Member
    I built more or less the same process in the RapidMiner GUI:
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.3.005">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="5.3.005" expanded="true" name="Process">
        <process expanded="true">
          <operator activated="true" class="open_file" compatibility="5.3.005" expanded="true" height="60" name="Open File" width="90" x="45" y="30">
            <parameter key="filename" value="C:\iris.arff"/>
          </operator>
          <operator activated="true" class="read_arff" compatibility="5.3.005" expanded="true" height="60" name="Read ARFF" width="90" x="179" y="30">
            <parameter key="data_file" value="C:\iris.arff"/>
            <list key="data_set_meta_data_information"/>
          </operator>
          <operator activated="true" class="dbscan" compatibility="5.3.005" expanded="true" height="76" name="Clustering" width="90" x="313" y="30"/>
          <operator activated="true" class="print_to_console" compatibility="5.3.005" expanded="true" height="94" name="Print to Console" width="90" x="450" y="30">
            <parameter key="log_value" value="5"/>
          </operator>
          <connect from_op="Open File" from_port="file" to_op="Read ARFF" to_port="file"/>
          <connect from_op="Read ARFF" from_port="output" to_op="Clustering" to_port="example set"/>
          <connect from_op="Clustering" from_port="cluster model" to_op="Print to Console" to_port="through 1"/>
          <connect from_op="Clustering" from_port="clustered set" to_op="Print to Console" to_port="through 2"/>
          <connect from_op="Print to Console" from_port="through 1" to_port="result 1"/>
          <connect from_op="Print to Console" from_port="through 2" to_port="result 2"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
          <portSpacing port="sink_result 3" spacing="0"/>
        </process>
      </operator>
    </process>
    I would like to carry this idea over to the Java code.  :-\
  • aborg
    aborg New Altair Community Member
    Hello, it seems you have to connect the ports. This thread gives an example of how to do it.
    Most probably you will need to iterate through the example set; here is an example.
    Hope this helps, gabor
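
The port names to connect can be read off the process XML posted above. What follows is only a minimal sketch that uses the JDK's built-in XML parser, not the RapidMiner API. The class name ProcessConnections, the method listConnections and the abridged SAMPLE_XML string are hypothetical names made up for this illustration. It lists every <connect> element as "operator.port -> operator.port", which shows that the DBScan input port in that process is called "example set" (the same port the error message reports as disconnected), whereas the posted Java code asks for "training set".

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class ProcessConnections {

    // Abridged copy of the process XML from this thread: layout attributes,
    // parameters and some connections are left out for brevity.
    static final String SAMPLE_XML =
          "<process version=\"5.3.005\">"
        + "<operator class=\"process\" name=\"Process\"><process>"
        + "<operator class=\"read_arff\" name=\"Read ARFF\"/>"
        + "<operator class=\"dbscan\" name=\"Clustering\"/>"
        + "<operator class=\"print_to_console\" name=\"Print to Console\"/>"
        + "<connect from_op=\"Read ARFF\" from_port=\"output\""
        + " to_op=\"Clustering\" to_port=\"example set\"/>"
        + "<connect from_op=\"Clustering\" from_port=\"cluster model\""
        + " to_op=\"Print to Console\" to_port=\"through 1\"/>"
        + "<connect from_op=\"Print to Console\" from_port=\"through 1\""
        + " to_port=\"result 1\"/>"
        + "</process></operator></process>";

    // Returns one "from -> to" line per <connect> element, in document order.
    static List<String> listConnections(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        NodeList connects = doc.getElementsByTagName("connect");
        List<String> result = new ArrayList<>();
        for (int i = 0; i < connects.getLength(); i++) {
            Element c = (Element) connects.item(i);
            result.add(endpoint(c, "from_op", "from_port")
                    + " -> " + endpoint(c, "to_op", "to_port"));
        }
        return result;
    }

    // A connection without an operator attribute ends at the enclosing process itself.
    private static String endpoint(Element c, String opAttr, String portAttr) {
        String op = c.hasAttribute(opAttr) ? c.getAttribute(opAttr) : "(process)";
        return op + "." + c.getAttribute(portAttr);
    }

    public static void main(String[] args) throws Exception {
        for (String line : listConnections(SAMPLE_XML)) {
            System.out.println(line);
        }
        // Prints:
        // Read ARFF.output -> Clustering.example set
        // Clustering.cluster model -> Print to Console.through 1
        // Print to Console.through 1 -> (process).result 1
    }
}
```

The last line also shows that in the GUI process the results only become process outputs because they are wired to a result port of the process. A process built by hand would presumably need an equivalent connection before anything can be printed.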
  • Islon
    Islon New Altair Community Member
    Do I also have to declare the operators of the XML process (the .rmp file) in the Java code?

    In my case, I want to read an .arff file and use it in DBScan. After that, I want to see the results in the console. How can I do this?
  • aborg
    aborg New Altair Community Member
    Ah, sorry, I had the impression that you wanted to create the process on demand rather than load it. For loading, you can check this code.
    Cheers, gabor
  • Marco_Boeck
    Marco_Boeck New Altair Community Member
    Hi,

    is there any particular reason why you are trying to create the process by hand instead of loading a process you created in the RapidMiner GUI, as shown in the "READ BEFORE POSTING: Frequently Asked Questions (Development)" thread?

    Regards,
    Marco