Creating a new clustering algorithm with Java and RapidMiner

imfaith
imfaith New Altair Community Member
edited November 5 in Community Q&A
Hi, I'm new to RapidMiner. My project is to design a new clustering algorithm in Java. For example, my model would first apply K-means and then add other techniques on top of it. I don't know how to approach this. I have searched online, but I still don't understand which of these I should do: integrate RapidMiner into my Java code and use the algorithms implemented in RapidMiner (ideally still seeing the results visualized as in RapidMiner, in my case the clustered data, so that I can compare my algorithm with existing ones), or build the new model in RapidMiner and then add it to my Java application.
I have integrated RapidMiner into Eclipse as described at http://rapid-i.com/content/view/25/48/lang,en/.
I don't understand what this is for. Can I, for example, use the K-means algorithm from RapidMiner and retrieve the results in my Java code?
I am confused. Can you help me figure out how to start?
Best regards.

Answers

  • Marc
    Marc New Altair Community Member
    Hello,
    If I may reply: I had the same problem when I started. Fortunately, all of RapidMiner is written in Java, so you can integrate it into your application by using its source code. The easiest way is to create the process in RapidMiner, run it from your application, and print the results to the console. Everything you can do in the RapidMiner GUI you can also do from Eclipse.
    http://rapid-i.com/wiki/index.php?title=Integrating_RapidMiner_into_your_application This page may help you understand the basics.
    Did that help a bit?
  • imfaith
    imfaith New Altair Community Member
    Hi,
    Thank you for the answer. I have integrated RapidMiner into Eclipse. My problem is that I cannot find a starting point. I must create a new clustering algorithm. First, I want to apply k-means directly in RapidMiner (since I can now start RapidMiner from Eclipse) and then retrieve the results (the clusters and their contents) in my Java code, so that I can continue and apply other techniques. I don't know whether this is feasible or how to do it, since I am also a beginner in Java. I looked for documentation but found nothing; RapidMiner's documentation is lacking.
    Thank you
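To make the "k-means first, then further steps" idea concrete, here is a minimal plain-Java sketch that does not use RapidMiner at all. It is not RapidMiner's k-means implementation: the class and method names (KMeansSketch, cluster, members) are hypothetical, and taking the first k points as initial centroids is an assumption chosen only to keep the result deterministic. It shows the shape of the data (a cluster index per point, and the point indices per cluster) that a later custom step could work on.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class KMeansSketch {

    /** Runs a basic k-means and returns the cluster index assigned to each point. */
    public static int[] cluster(double[][] points, int k, int maxIterations) {
        int dim = points[0].length;
        // Assumption: use the first k points as initial centroids (deterministic).
        double[][] centroids = new double[k][];
        for (int c = 0; c < k; c++) {
            centroids[c] = points[c].clone();
        }
        int[] assignment = new int[points.length];
        Arrays.fill(assignment, -1);

        for (int iter = 0; iter < maxIterations; iter++) {
            boolean changed = false;
            // Assignment step: attach every point to its nearest centroid.
            for (int i = 0; i < points.length; i++) {
                int best = 0;
                double bestDist = Double.MAX_VALUE;
                for (int c = 0; c < k; c++) {
                    double d = squaredDistance(points[i], centroids[c]);
                    if (d < bestDist) {
                        bestDist = d;
                        best = c;
                    }
                }
                if (assignment[i] != best) {
                    assignment[i] = best;
                    changed = true;
                }
            }
            if (!changed) {
                break; // converged: no point switched cluster
            }
            // Update step: move each centroid to the mean of its points.
            double[][] sums = new double[k][dim];
            int[] counts = new int[k];
            for (int i = 0; i < points.length; i++) {
                counts[assignment[i]]++;
                for (int d = 0; d < dim; d++) {
                    sums[assignment[i]][d] += points[i][d];
                }
            }
            for (int c = 0; c < k; c++) {
                if (counts[c] > 0) {
                    for (int d = 0; d < dim; d++) {
                        centroids[c][d] = sums[c][d] / counts[c];
                    }
                }
            }
        }
        return assignment;
    }

    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int d = 0; d < a.length; d++) {
            double diff = a[d] - b[d];
            sum += diff * diff;
        }
        return sum;
    }

    /** Groups point indices by cluster, so a later step can process each cluster's contents. */
    public static List<List<Integer>> members(int[] assignment, int k) {
        List<List<Integer>> result = new ArrayList<>();
        for (int c = 0; c < k; c++) {
            result.add(new ArrayList<>());
        }
        for (int i = 0; i < assignment.length; i++) {
            result.get(assignment[i]).add(i);
        }
        return result;
    }

    public static void main(String[] args) {
        double[][] data = {
            {1.0, 1.0}, {8.0, 8.0}, {1.5, 2.0}, {9.0, 9.5}, {1.0, 0.5}, {8.5, 9.0}
        };
        int[] assignment = cluster(data, 2, 100);
        System.out.println(members(assignment, 2));
    }
}
```

A custom algorithm would then take the lists returned by members(...) as its input. Whether the clusters come from a sketch like this or from a RapidMiner process, the follow-up step only needs that per-cluster view of the data.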
  • Marco_Boeck
    Marco_Boeck New Altair Community Member
    Hi,

    I would suggest what I always suggest ;)
    Create the process(es) you need in the RapidMiner GUI, then execute the process on your data via Java and continue to work with the results. To see how this is done, see for example here.
    If you want to create your own operator, have a look at the existing operators (check the OperatorsCore.xml file to see the classes behind the RM GUI operators) and go from there.

    Regards,
    Marco
  • imfaith
    imfaith New Altair Community Member
    Thank you for the clear answer. I wrote this program but ran into some problems that I don't understand, because I am a beginner in Java:

    import com.rapidminer.RapidMiner;
    import com.rapidminer.RapidMinerCommandLine;
    import com.rapidminer.example.ExampleSet;
    import com.rapidminer.operator.ExecutionMode;
    import com.rapidminer.operator.IOContainer;
    import com.rapidminer.operator.IOObject;
    import com.rapidminer.repository.IOObjectEntry;
    import com.rapidminer.repository.ProcessEntry;
    import com.rapidminer.repository.RepositoryLocation;

    public class model {

        public static void main(String args[]) throws Exception {

            // this initializes RapidMiner with your repositories available

            RapidMiner.setExecutionMode(ExecutionMode.COMMAND_LINE);

            RapidMiner.init();
            // loads the process from the repository
            RepositoryLocation pLoc = new RepositoryLocation("//C:/Users/faith/Desktop/MyRepository/MyData/kmeansProcess");
            ProcessEntry pEntry = (ProcessEntry) pLoc.locateEntry();
            String processXML = pEntry.retrieveXML();
            Process myProcess = new Process(processXML);
            // if need be, you can give the process IOObjects as parameter (this would be the case if you used the process input ports)
            RepositoryLocation loc = new RepositoryLocation("//C:/Users/faith/Desktop/MyRepository/MyData/cars");
            IOObjectEntry entry = (IOObjectEntry) loc.locateEntry();
            IOObject myIOObject = entry.retrieveData(null);

            // execute the process and get the resulting objects
            IOContainer ioInput = new IOContainer(new IOObject[] {myIOObject});
            // just use myProcess.run() if you don't use the input ports for your process
            IOContainer ioResult = myProcess.run(ioInput);

            // use the result(s) as needed, for example if your process just returns one ExampleSet, use this:
            if (ioResult.getElementAt(0) instanceof ExampleSet) {
                ExampleSet resultSet = (ExampleSet) ioResult.getElementAt(0);
            }
        }
    }



    Exception in thread "main" java.lang.Error: Unresolved compilation problems:
    COMMAND_LINE cannot be resolved or is not a field
    Cannot instantiate the type Process
    The method run(IOContainer) is undefined for the type Process

    at model.main(model.java:23)

    Thank you
  • Marco_Boeck
    Marco_Boeck New Altair Community Member
    Hi,

    Do you use an IDE? Please consider using, for example, Eclipse, and press Ctrl+Shift+O while in your Java class. That will handle the imports. You are missing required imports in your class.

    Regards,
    Marco
  • imfaith
    imfaith New Altair Community Member
    Hi,
    I use Eclipse. No imports are added, and it does not ask about missing imports. It shows me three errors:
    1) It does not recognize COMMAND_LINE ("COMMAND_LINE cannot be resolved or is not a field") in this statement:
    RapidMiner.setExecutionMode(ExecutionMode.COMMAND_LINE);
    When I look at the options Eclipse proposes, COMMAND_LINE is not among them. What does that mean here?
    2) "Cannot instantiate the type Process" in this statement:
    String processXML = pEntry.retrieveXML();
    Process myProcess = new Process(processXML);
    3) "The method run(IOContainer) is undefined for the type Process" in this statement:
    IOContainer ioResult = myProcess.run(ioInput);
    When I look at the methods available on myProcess, I do not find a run() method.

    I could not solve these three problems.

    Best Regards
  • imfaith
    imfaith New Altair Community Member
    Hi,
    I solved the first problem with this statement:
    RapidMiner.setExecutionMode(com.rapidminer.RapidMiner.ExecutionMode.COMMAND_LINE);
    The last two problems are still unsolved.

    Thanks
  • Marco_Boeck
    Marco_Boeck New Altair Community Member
    Hi,

    As stated in the Development FAQ, integrating RM is not recommended if you're a Java beginner.
    The errors show that your imports are messed up.

    Regards,
    Marco
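One reading of errors 2 and 3 that fits the messages: with no explicit import for it, the simple name Process resolves to java.lang.Process, which every Java source file sees implicitly. That class is abstract (hence "Cannot instantiate the type Process") and has no run method (hence "run(IOContainer) is undefined"). It would also explain why Ctrl+Shift+O adds nothing, since the name already resolves. The sketch below uses only the JDK to check this; ProcessNameCheck and its methods are hypothetical names made up for the illustration.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

public class ProcessNameCheck {

    /** Fully qualified name that the simple name Process resolves to when nothing else is imported. */
    public static String resolvedName() {
        return Process.class.getName();
    }

    /** True if that class is abstract, i.e. "new Process(...)" cannot compile. */
    public static boolean isAbstract() {
        return Modifier.isAbstract(Process.class.getModifiers());
    }

    /** True if the class exposes any public method with the given name. */
    public static boolean hasMethodNamed(String name) {
        for (Method m : Process.class.getMethods()) {
            if (m.getName().equals(name)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println("Process resolves to: " + resolvedName());
        System.out.println("abstract: " + isAbstract());
        System.out.println("has run(): " + hasMethodNamed("run"));
    }
}
```

If that reading is right, the likely fix is an explicit import of RapidMiner's own process class at the top of the file. Assuming that class is com.rapidminer.Process (the thread does not show it), the line would be `import com.rapidminer.Process;`. An explicit single-type import takes precedence over the implicit java.lang one.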
  • imfaith
    imfaith New Altair Community Member
    Hi,
    Thank you, I have corrected the errors. But when I run the program, I get this error message:

    8 nov. 2012 10:52:43 com.rapidminer.tools.ParameterService init
    INFO: Reading configuration resource com/rapidminer/resources/rapidminerrc.
    8 nov. 2012 10:52:44 com.rapidminer.tools.I18N <clinit>
    INFO: Set locale to en.
    8 nov. 2012 10:52:44 com.rapid_i.Launcher ensureRapidMinerHomeSet
    INFO: Property rapidminer.home is not set. Guessing.
    8 nov. 2012 10:52:44 com.rapid_i.Launcher ensureRapidMinerHomeSet
    INFO: Trying parent directory of 'C:\Program Files\Rapid-I\RapidMiner5\lib\launcher.jar'...gotcha!
    8 nov. 2012 10:52:44 com.rapid_i.Launcher ensureRapidMinerHomeSet
    INFO: Trying parent directory of 'C:\Program Files\Rapid-I\RapidMiner5\lib\rapidminer.jar'...gotcha!
    8 nov. 2012 10:52:56 com.rapidminer.parameter.ParameterTypePassword decryptPassword
    WARNING: Password in XML file looks like unencrypted plain text.
    8 nov. 2012 10:53:04 com.rapidminer.tools.plugin.Plugin registerOperators
    INFO: No operator descriptor specified for plugin Community. Trying plugin initializtation class com.rapidminer.community.CommunityPluginInit.
    8 nov. 2012 10:53:04 com.rapidminer.tools.plugin.Plugin registerOperators
    WARNING: No operator descriptor defined for: Community
    8 nov. 2012 10:53:06 com.rapidminer.tools.jdbc.JDBCProperties <init>
    WARNING: Missing database driver class name for ODBC Bridge (e.g. Access)
    8 nov. 2012 10:53:06 com.rapidminer.tools.jdbc.JDBCProperties registerDrivers
    INFO: JDBC driver ca.ingres.jdbc.IngresDriver not found. Probably the driver is not installed.
    8 nov. 2012 10:53:06 com.rapidminer.tools.jdbc.JDBCProperties registerDrivers
    INFO: JDBC driver oracle.jdbc.driver.OracleDriver not found. Probably the driver is not installed.
    Exception in thread "main" com.rapidminer.repository.RepositoryException: Requested repository C: does not exist.
    at com.rapidminer.repository.RepositoryManager.getRepository(RepositoryManager.java:202)
    at com.rapidminer.repository.RepositoryLocation.getRepository(RepositoryLocation.java:144)
    at com.rapidminer.repository.RepositoryLocation.locateEntry(RepositoryLocation.java:167)
    at Model.main(Model.java:25)
    Can you help me?
  • Marco_Boeck
    Marco_Boeck New Altair Community Member
    Hi,

    Please read the available documentation, especially the manual and the How to extend RapidMiner whitepaper, before proceeding. You are trying to access a repository which does not exist.

    Regards,
    Marco
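The message "Requested repository C: does not exist" suggests that the text between the leading "//" and the next "/" is read as the name of a repository, not as a file-system path. The sketch below imitates that split with a hypothetical helper (repositoryName is not RapidMiner's parser) to show why the location in the program yields "C:". The second location assumes the repository is registered in RapidMiner under the name MyRepository; the real name is whatever the Repositories view in the GUI shows.

```java
public class RepositoryLocationSketch {

    /** Returns the text between the leading "//" and the next "/". */
    public static String repositoryName(String location) {
        if (!location.startsWith("//")) {
            throw new IllegalArgumentException("Expected a location starting with //: " + location);
        }
        int end = location.indexOf('/', 2);
        return end < 0 ? location.substring(2) : location.substring(2, end);
    }

    public static void main(String[] args) {
        // Location used in the program: the first segment is the drive letter.
        System.out.println(repositoryName("//C:/Users/faith/Desktop/MyRepository/MyData/kmeansProcess"));
        // Assumed form: repository name first, then the path inside the repository.
        System.out.println(repositoryName("//MyRepository/MyData/kmeansProcess"));
    }
}
```

Under that assumption, the two RepositoryLocation strings in the program would start with the repository's name followed by the folder and entry inside it, rather than with the Windows path of the folder on disk.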