Data Mining PC and Benchmarking

MBA_Data_Miner New Altair Community Member
edited November 2024 in Community Q&A
Howdy folks,


I am interested in what kind of hardware other users run their data mining software on. I have been benchmarking several machines and would like to compare notes with other forum members. Do you have any recommendations for data mining computer specs? Are you finding that higher-end systems are substantially faster?
My results so far show a late-model MacBook Pro as the fastest of the few notebooks tested. I am also testing a desktop for comparison.

Please advise and comment,

Best regards, J.

Answers

  • MartinLiebig
    Altair Employee
    Hi,

    We RapidMiner consultants use fairly standard Lenovo ThinkPads. The only remarkable thing is that they have quite a lot of memory. If you really need to do something with a lot of CPU/RAM load, you simply switch over to RM Server or the cloud. Personally, I keep basically all of my processes on servers.

    Cheers,
    Martin
  • MBA_Data_Miner
    MBA_Data_Miner New Altair Community Member
    That does help. Alas, the probability of me having access to a server in the near future is low, so I will have to work with single systems. I have tested a couple of different computers now, and the MacBook has beaten everything so far. Second place goes to an Asus G751 gaming laptop. The Asus has 24 GB of RAM, which seems to help quite a bit.

    I have one last system to test, a circa 2012 custom gaming desktop. It has a great processor but less RAM than the Asus.

    The process being tested uses a balanced and binned dataset of 5000 examples. A parameter optimization (grid, parallel) runs around a 10-fold cross-validation of a decision tree, optimizing five decision tree parameters.
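
To get a feel for how much work this benchmark involves, here is a rough Python sketch that counts the tree fits in a grid search over five parameters with 10 folds on 5000 examples. The parameter names and the three values per parameter are assumptions for illustration only; the post does not say which parameters or ranges were used, and this is not RapidMiner code.

```python
from itertools import product

# Hypothetical grid: the post names neither the five parameters nor their ranges.
param_grid = {
    "max_depth": [5, 10, 20],
    "min_leaf_size": [1, 2, 5],
    "min_split_size": [2, 4, 8],
    "min_gain": [0.0, 0.01, 0.1],
    "confidence": [0.1, 0.25, 0.5],
}

N_EXAMPLES = 5000  # dataset size stated in the post
N_FOLDS = 10       # cross-validation folds stated in the post


def grid_combinations(grid):
    """Return every parameter combination in the grid as a dict."""
    names = list(grid)
    return [dict(zip(names, values)) for values in product(*grid.values())]


def fold_sizes(n_examples, n_folds):
    """Sizes of the held-out folds when the data is split as evenly as possible."""
    base, extra = divmod(n_examples, n_folds)
    return [base + 1 if i < extra else base for i in range(n_folds)]


combos = grid_combinations(param_grid)
total_fits = len(combos) * N_FOLDS          # one tree per combination per fold
test_size = fold_sizes(N_EXAMPLES, N_FOLDS)[0]
train_size = N_EXAMPLES - test_size

print(f"{len(combos)} combinations x {N_FOLDS} folds = {total_fits} tree fits")
print(f"each fit trains on {train_size} examples and tests on {test_size}")
```

With three assumed values per parameter, the grid has 243 combinations, so a single run trains 2,430 trees on 4,500 examples each. Because the fits are independent of one another, a parallel grid search can spread them across CPU cores, which is one plausible reason core count and RAM would matter in a test like this.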
  • MBA_Data_Miner
    MBA_Data_Miner New Altair Community Member
    Just a thought: it would be amazing if RapidMiner created a simple benchmark program for users to run, with a few datasets of different sizes and a few algorithms to test. Think along the lines of CPU benchmarking programs, but specifically for data mining. It would be great to be able to test and upload results for different hardware/OS configurations and share them with people around the world. I often use benchmarks published online to compare PC performance.

    Any comments or thoughts on this?
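
As a rough idea of what such a benchmark could look like, here is a minimal Python sketch using only the standard library. It times a stand-in workload (a one-split threshold search, not a real decision tree and not RapidMiner) on synthetic datasets of a few sizes and prints the timings together with basic machine information as JSON. All function names, dataset sizes, and the workload itself are assumptions for illustration.

```python
import json
import os
import platform
import random
import time


def make_dataset(n_rows, seed=0):
    """Synthetic balanced binary dataset with one numeric attribute."""
    rng = random.Random(seed)
    rows = []
    for i in range(n_rows):
        label = i % 2  # alternate labels so the classes stay balanced
        rows.append((rng.gauss(label, 1.0), label))
    return rows


def fit_stump(rows):
    """Stand-in workload: find the threshold that best separates the classes."""
    ordered = sorted(rows)
    total_pos = sum(label for _, label in ordered)
    best_threshold, best_correct = ordered[0][0], 0
    pos_left = 0
    for i, (value, label) in enumerate(ordered):
        pos_left += label
        neg_left = (i + 1) - pos_left
        # Predict class 0 at or below the threshold and class 1 above it.
        correct = neg_left + (total_pos - pos_left)
        if correct > best_correct:
            best_threshold, best_correct = value, correct
    return best_threshold, best_correct / len(ordered)


def run_benchmark(sizes=(1000, 5000, 20000), repeats=3):
    """Time the workload per dataset size, keeping the best of several runs."""
    results = {
        "machine": platform.machine(),
        "system": platform.system(),
        "python": platform.python_version(),
        "cpu_count": os.cpu_count(),
        "timings_seconds": {},
    }
    for n in sizes:
        rows = make_dataset(n)
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            fit_stump(rows)
            best = min(best, time.perf_counter() - start)
        results["timings_seconds"][str(n)] = best
    return results


if __name__ == "__main__":
    print(json.dumps(run_benchmark(), indent=2))
```

Keeping the best of several repeats reduces noise from background processes, and the JSON output would be easy to collect and compare across hardware and OS configurations. A real data mining benchmark would swap in actual learners and larger datasets.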
  • MartinLiebig
    Altair Employee
    Hi,

    Honestly, performance is not the point. Most users will have a server somewhere and use it when a job gets CPU-heavy.
    Regarding the decision tree: did you use RM version 6.3 or later? We optimized the Decision Tree in version 6.3.

    Your account is registered to a .edu address. Are you aware of our academic program? It gives you the opportunity to get a server license. A server can run on any machine that runs Java (Linux/Mac/Windows).

    Best,
    Martin
  • MBA_Data_Miner
    MBA_Data_Miner New Altair Community Member
    I wasn't sure if I could even use the academic program because of my role: I am an employee (an institutional analyst) at a higher education institution rather than a student or professor. I've been using the free community edition (5.3) of Studio.