"how to develop a new algorithm in RapidMiner?"

User: "Obaeissa"
New Altair Community Member
Updated by Jocelyn
I have an idea for a new algorithm that I want to develop and test in RapidMiner. Should I use the extension template provided by RapidMiner, or is there another way?

    User: "rfuentealba"
    New Altair Community Member
    Accepted Answer
    Hello, @Obaeissa, welcome to the community.

    The RapidMiner extension template is provided so that you don't have to set up the connection to RapidMiner and its imports yourself. If you are proficient in Java, it is the recommended way to implement your algorithm. You can also implement it in the Apache Groovy programming language and run it with the "Execute Script" operator. However, I haven't seen much documentation about this (perhaps my good friends @mschmitz, @David_A and @land can give you some more tricks. Perhaps @IngoRM too).

    If your idea for an algorithm is something you are trying for the first time, I would recommend creating a Python implementation first (or one in whatever language you feel comfortable with), and then building a RapidMiner operator (or superoperator) based on that. At least that is what I did when I "invented" the Naïve Bayes algorithm (yes, I did it almost 200 years after Thomas Bayes, but I didn't know it until I saw my first data science books, so... sorry). If you go this route, make sure you use the Anaconda Python distribution and the Python Scripting extension, so that it is easier to test through RapidMiner.
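
    To make the "prototype in Python first" route concrete, here is a minimal sketch of a categorical Naïve Bayes with add-one smoothing, written in plain Python so it can be tested outside RapidMiner. It is only an illustration, not a reference implementation. The column name "label" and the helpers fit and predict are hypothetical. The rm_main wrapper assumes the Python Scripting extension's convention of calling a function named rm_main with the input example set as a pandas DataFrame, so check that against the extension's documentation for your version.

```python
import math
from collections import Counter, defaultdict

LABEL = "label"  # hypothetical name of the target column


def fit(rows, label=LABEL):
    """Count class frequencies and per-class attribute value frequencies.

    rows is a list of dicts, one dict per example.
    """
    class_counts = Counter(row[label] for row in rows)
    value_counts = defaultdict(Counter)  # (attribute, class) -> value counts
    domains = defaultdict(set)           # attribute -> values seen in training
    for row in rows:
        for attr, value in row.items():
            if attr == label:
                continue
            value_counts[(attr, row[label])][value] += 1
            domains[attr].add(value)
    return class_counts, value_counts, domains


def predict(model, row):
    """Return the class with the highest smoothed log-posterior for one row."""
    class_counts, value_counts, domains = model
    total = sum(class_counts.values())
    best_class, best_score = None, -math.inf
    for cls, n_cls in class_counts.items():
        score = math.log(n_cls / total)  # log prior
        for attr, domain in domains.items():
            count = value_counts[(attr, cls)][row[attr]]
            # Add-one (Laplace) smoothing keeps unseen values from zeroing out
            score += math.log((count + 1) / (n_cls + len(domain)))
        if score > best_score:
            best_class, best_score = cls, score
    return best_class


def rm_main(data):
    """Assumed entry point for the Execute Python operator.

    data is expected to be a pandas DataFrame; only DataFrame methods are
    used here, so the prototype itself does not import pandas.
    """
    rows = data.to_dict("records")
    model = fit(rows)
    return data.assign(prediction=[predict(model, row) for row in rows])
```

    Keeping fit and predict free of any RapidMiner-specific code means the same functions can be exercised with ordinary unit tests first, and only the thin rm_main wrapper has to change if the operator's calling convention differs.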

    BTW, write a paper about your algorithm. It is important to keep things as scientific as possible, not because it is a RapidMiner requirement but because data scientists like academic processes. Yes, you will hear @yyhuang saying that "a lot of academic data scientists haven't seen problems in real life", but creating an algorithm (rather than making use of it) is a totally different matter.

    Hope this helps,

    Rodrigo.

    User: "IngoRM"
    New Altair Community Member
    Accepted Answer
    Updated by IngoRM
    To add to Rodrigo's comment: I would definitely recommend always working with the language you are already comfortable with. If you know Java, there is simply no point in learning Python first; going straight to building a Java extension is most likely the simplest way for you. But if you already know R or Python, or even have an implementation there already, the first step should always be to integrate that, just as Rodrigo has said.
    So let's assume you in fact do know Java and want to go down the extension route.  Then please use this documentation here:
    If you are familiar with Java, Git, Gradle and your favorite IDE (IntelliJ, Eclipse) already, you should be able to be up and running in less than an hour...
    On the freelancing: while I would certainly be able to code this for you, I have some doubts that you would be willing to pay my daily rate for that ;) - so I hope that somebody else would step in here to help out in case you need it.
    Hope this helps,
    Ingo