"Studio hanging on a large dataset with 3,000,000 rows"

MarlaBot
MarlaBot New Altair Community Member
edited November 5 in Community Q&A
A RapidMiner user wants to know the answer to this question: I am trying to build a model from the 311 Explorer data here https://connect.edmonton.ca/#!/view-data. My model gets stuck at about 5% and I think Studio may be crashing. Any ideas?

Answers

  • twentworth
    twentworth New Altair Community Member
    I tried the data in Auto Model and it seems to be hanging for me as well. 
  • varunm1
    varunm1 New Altair Community Member
    @twentworth I faced this issue earlier. Can you check how much of your PC's RAM RM is using? Do you see it using all the memory?

    Thanks
    varun
  • twentworth
    twentworth New Altair Community Member
    @varunm1 CPU is pegged but memory is fine. I'm on a Mac. 
  • varunm1
    varunm1 New Altair Community Member
    @twentworth Can you tell me what you are trying to do in Auto Model? This works fine for me on Windows. It might be a specific error.
  • twentworth
    twentworth New Altair Community Member
    This error is from @taghaddo, maybe he can chime in?
  • taghaddo
    taghaddo New Altair Community Member
    I am trying to use a prediction model, but it hangs at 13% progress for KNN
    and at 5% for linear regression.

  • taghaddo
    taghaddo New Altair Community Member
    It is using the full CPU.
  • varunm1
    varunm1 New Altair Community Member
    @taghaddo Can you post your XML code here? To copy the XML code, go to View --> Show Panel --> XML, then copy the whole code and paste it here. Also, I just want to confirm the size of the dataset: I downloaded it and it shows 360K samples, not 3 million. Am I correct?
  • taghaddo
    taghaddo New Altair Community Member
    yes, 3K.
  • sgenzer
    sgenzer
    Altair Employee
    edited February 2019
    This is a 100MB CSV file, so I would expect a standard desktop/laptop to bog down here. Which feature are you trying to predict? I tried to predict "Service Code" (why not?) using Auto Model. Only Naive Bayes is recommended, as the others are too resource-intensive.

    @taghaddo can you please post your XML so we can see your process?

    Scott

    [EDIT: even NB will take a long time to run in RM 9.1, but it is very fast in RM 9.2 ;) ]
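
Since the thread is unsure about the real row count and the file is large enough to bog down a laptop, one option is to check the size and cut the data down before loading it into Studio. The following is only a minimal sketch in plain Python, outside RapidMiner and not something any poster above used. The function name `count_and_sample` and the file name in the usage comment are hypothetical. It counts the data rows and keeps a uniform random sample in a single pass (reservoir sampling), so the whole file never has to sit in memory.

```python
import csv
import random


def count_and_sample(path, k, seed=0):
    """Return (header, row_count, sample) for a CSV file.

    sample holds up to k data rows chosen uniformly at random in one
    pass over the file (reservoir sampling, "Algorithm R").
    """
    rng = random.Random(seed)  # fixed seed so the sample is repeatable
    sample = []
    n = 0
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        header = next(reader, None)  # None if the file is empty
        for row in reader:
            n += 1
            if len(sample) < k:
                # Fill the reservoir with the first k rows.
                sample.append(row)
            else:
                # Row n replaces a random slot with probability k / n.
                j = rng.randrange(n)
                if j < k:
                    sample[j] = row
    return header, n, sample


def write_sample(path, header, rows):
    """Write the sampled rows to a new, smaller CSV file."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        if header is not None:
            writer.writerow(header)
        writer.writerows(rows)


# Hypothetical usage (file names are placeholders, not from the thread):
# header, n, rows = count_and_sample("311_explorer.csv", k=50000)
# print(n)  # number of data rows, excluding the header
# write_sample("311_explorer_sample.csv", header, rows)
```

If the models finish on the smaller file, that suggests Studio is grinding through the data rather than crashing, and the sample size can then be raised step by step.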