Re: turbo model and outlier removal

guptasha
guptasha New Altair Community Member
edited November 5 in Altair RapidMiner
I am analyzing the wine reviews dataset from Kaggle in RapidMiner. Please search for the dataset on Google.
1. I am not able to use turbo model on this dataset. My laptop hangs. Is there any way I can run a 150k-row dataset successfully?
2. How do I remove outliers in the price column? Any suggestions?

Comments

  • sgenzer
    sgenzer Altair Employee
    Hello @guptasha - it is good to have you here on the community. Let me try to help you step by step...
    Please search for the dataset on Google.
    Generally, here on the community you want to make it easy for us to help you. Asking people to Google a data set is not likely to get you answers, even from generous people. :smile:

    And by the way - the wine data set is of course already built into RapidMiner. Just type the word "wine" into the global search bar.

    I am not able to use turbo model on this dataset.
    I'm a little confused here. We have "Turbo Prep" and "Auto Model" - which one are you referring to?

    My laptop hangs.
    It's possible for sure - especially if you have a small laptop and a large data set. Have you looked at our System Requirements for RapidMiner Studio?

    Is there any way I can run a 150k-row dataset successfully?
    I run 150k-row data sets successfully every day. If your laptop is hanging, most likely your computer is either below spec or only just meets it. Increasing your RAM and CPU cores can help a lot.

    How do I remove outliers in the price column? Any suggestions?
    This is hard to answer when you have not provided (a) your XML, and (b) your data set. Perhaps you overlooked these instructions when you posted your question?



    Scott
  • Telcontar120
    Telcontar120 New Altair Community Member
    Also, you can always try the Sample operator as an initial workaround. You don't really need 150k records to build a preliminary scorecard. A 10% or 20% sample should be perfectly adequate to get you going...
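
For readers who want to try the sampling idea outside RapidMiner, here is a minimal Python sketch. It is not a RapidMiner process and not anything the posters shared. It assumes the rows are already loaded as a list of dicts with a numeric `price` key (missing prices stored as `None`). The function names `sample_rows` and `remove_price_outliers` are hypothetical. The thread does not name an outlier method, so the second function uses the common 1.5 x IQR (Tukey fence) rule as one possible choice.

```python
import random
import statistics


def sample_rows(rows, fraction=0.1, seed=42):
    """Return a random subset holding roughly `fraction` of the rows."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    if not rows:
        return []
    # Always keep at least one row so a tiny input still yields a sample.
    k = max(1, round(len(rows) * fraction))
    # A seeded generator makes the sample reproducible between runs.
    return random.Random(seed).sample(rows, k)


def remove_price_outliers(rows, column="price", factor=1.5):
    """Drop rows whose price lies outside the IQR fences.

    Assumption: rows with a missing price (None) are kept untouched,
    because a missing value is not an outlier by this rule.
    """
    prices = [r[column] for r in rows if r[column] is not None]
    if len(prices) < 2:
        # Too few values to estimate quartiles; return the data unchanged.
        return list(rows)
    q1, _, q3 = statistics.quantiles(prices, n=4)
    iqr = q3 - q1
    low, high = q1 - factor * iqr, q3 + factor * iqr
    return [
        r for r in rows
        if r[column] is None or low <= r[column] <= high
    ]
```

Calling `sample_rows(rows, fraction=0.1)` first and then `remove_price_outliers(...)` on the result keeps the working set small while you experiment. Whether 1.5 is the right `factor` depends on how skewed the prices are, so treat it as a starting point only.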