Optimize Selection Evolutionary, Parallel Scaling

hughesfleming68
hughesfleming68 New Altair Community Member
edited November 2024 in Community Q&A

Has anyone done any tests to determine how well RapidMiner 7 scales on multicore CPUs? Particularly on machines with 16 threads or more?

 

Many thanks,

 

Alex

Answers

  • JEdward
    JEdward New Altair Community Member

    Hi Alex,

     

    One of the best experts for this would be @land, as his company developed an extension specifically for parallelizing efficiently in RapidMiner.

    I think you spoke with him on the forum recently. I'm sure he has some good information in that area.

     

    Regards,

    John.

  • hughesfleming68
    hughesfleming68 New Altair Community Member

    Thanks John,

     

    I will ask him. I will also try to set up a test myself over the next couple of weeks.

     

    regards,

     

    Alex

  • land
    land New Altair Community Member
    Hi,

    In principle, you need to consider that each thread needs its own copy of the data, so your memory should scale with the number of threads you use.
    The easiest approach is to use multiple threads for the cross-validation; with x threads this directly gives a nearly x-fold speedup.
    However, since one usually uses a 10-fold cross-validation (I usually make it 8 folds to match my CPU cores), this speedup is limited by the number of folds. If you need to utilize more threads, you also need to run the outer operators in parallel.
    I usually avoid this and instead run multiple processes in parallel. One usually does not run just ONE optimization, but several for different methods. This way you can easily saturate even larger servers.
    And of course, real-world projects usually need not just one model but several. So you can also loop over groups of data and calculate their models in parallel.

    Until recently, we offered the Jackhammer Extension, which added a lot of the necessary functionality.

    Greetings,
    Sebastian
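
As a rough illustration of the fold-count and memory points in Sebastian's reply, here is a minimal Python sketch (not RapidMiner code). It assumes every fold takes the same time, ignores scheduling overhead and any final model built on the full data, and takes "each thread needs a copy of the data" literally. The function names `ideal_speedup` and `memory_for_copies` and the 2 GB data size are hypothetical, chosen only for this example.

```python
import math


def ideal_speedup(folds: int, threads: int) -> float:
    """Best-case speedup when cross-validation folds run in parallel.

    Assumption: all folds take equal time and there is no overhead.
    """
    if folds < 1 or threads < 1:
        raise ValueError("folds and threads must be positive")
    # Folds are processed in rounds of at most `threads` folds each.
    rounds = math.ceil(folds / threads)
    # Serial time is `folds` units, parallel time is `rounds` units.
    return folds / rounds


def memory_for_copies(dataset_gb: float, threads: int) -> float:
    """Memory needed if every thread holds its own copy of the data."""
    return dataset_gb * threads


if __name__ == "__main__":
    for folds, threads in [(10, 8), (8, 8), (10, 16), (10, 32)]:
        speedup = ideal_speedup(folds, threads)
        memory = memory_for_copies(2.0, threads)  # hypothetical 2 GB data set
        print(f"{folds} folds on {threads} threads: "
              f"up to {speedup:.1f}x, about {memory:.0f} GB for data copies")
```

Under these assumptions, 10 folds on 8 threads need two rounds, so the best case is only 5x, while 8 folds on 8 threads finish in one round for 8x. Beyond 10 threads a 10-fold validation gains nothing further, which is why outer operators or separate processes are needed to keep 16 or more threads busy.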