Bulk scoring in RapidMiner Server

User: "Neel"
New Altair Community Member
Updated by Jocelyn
Hi everyone,

What is the best way to bulk score new records (hundreds of thousands, originating from an enterprise DB) using a deployed model (deployed via Deployment) in RapidMiner Server?

> I have tried using the web service, but it does not scale. The response time for a single record is currently around 3 seconds.
> There is no real-time scoring requirement. It is a single bulk request once a day.

    User: "IngoRM"
    New Altair Community Member
    Accepted Answer
    Hi,
    With the upcoming RM 9.6 version, you can turn off the explanations for predictions, which slow down scoring a lot. But for true bulk scoring, a single-row web service approach does not seem to be a good fit anyway, IMHO.
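    To put a rough number on why the single-row approach does not scale, here is a back-of-the-envelope estimate. It assumes the requests are sent one after another and that the 3 seconds per record reported in the question stays constant; the record counts are illustrative assumptions, not measurements.

```python
# Back-of-the-envelope estimate of sequential single-record scoring time.
# SECONDS_PER_RECORD comes from the question; the record counts are assumed.
SECONDS_PER_RECORD = 3.0


def sequential_hours(n_records, seconds_per_record=SECONDS_PER_RECORD):
    """Hours needed to score n_records one request at a time."""
    return n_records * seconds_per_record / 3600.0


for n in (100_000, 500_000):
    print(f"{n:>7} records: about {sequential_hours(n):.1f} hours")
```

    Even at the low end this is far longer than a day, so a daily job cannot finish before the next one starts unless the records are scored as a set.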
    If you check the repository folder of the deployed models, you will find a process called "score_set" which you can use as a blueprint. Make a copy of it and adapt it a bit: add a data source (reading from your DB) at the beginning and, in particular, turn on the parameter "only predictions" of the operator "Explain Prediction" to speed things up. If you also want monitoring, you can add the operator MDMLogging as well. That is a bit more complicated, so I suggest dealing with it last, once everything else works and you know you want the logging.
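    The overall shape of such a bulk job (read from the DB, score a whole set at once, store the results) can be sketched outside RapidMiner as follows. This is only an illustration, not the "score_set" process itself: the in-memory SQLite database stands in for the enterprise DB, the table and column names are made up, and `score_batch` is a hypothetical placeholder for the deployed model.

```python
import sqlite3


def score_batch(rows):
    """Hypothetical stand-in for the deployed model: scores many rows per call."""
    # Placeholder rule (assumption): predict 1 when amount exceeds 100.
    return [(record_id, 1 if amount > 100 else 0) for record_id, amount in rows]


def bulk_score(conn, batch_size=1000):
    """Read records in batches, score each batch, and store the predictions."""
    read_cur = conn.cursor()
    write_cur = conn.cursor()
    read_cur.execute("SELECT id, amount FROM new_records ORDER BY id")
    total = 0
    while True:
        rows = read_cur.fetchmany(batch_size)
        if not rows:
            break
        write_cur.executemany(
            "INSERT INTO scores (id, prediction) VALUES (?, ?)",
            score_batch(rows),
        )
        total += len(rows)
    conn.commit()
    return total


# Demo with an in-memory database and made-up data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE new_records (id INTEGER PRIMARY KEY, amount REAL)")
conn.execute("CREATE TABLE scores (id INTEGER PRIMARY KEY, prediction INTEGER)")
conn.executemany(
    "INSERT INTO new_records VALUES (?, ?)",
    [(i, float(i)) for i in range(1, 2501)],
)
scored = bulk_score(conn, batch_size=1000)
print(f"scored {scored} records")
```

    The point of the sketch is that the per-call overhead is paid once per batch rather than once per record, which is the same reason a set-based process suits a daily bulk run better than a single-row web service.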
    Hope this helps,
    Ingo