Why does RapidMiner Studio reduce the number of rows in the model results?

User: "jsdrew"
New Altair Community Member
Updated by Jocelyn
I am using RapidMiner Studio for the first time. I've loaded a dataset and run AutoModel on it, but the exported results have only about 11,000 rows while the dataset has 29,000 rows. How do I get it to give me predictions for all rows?

    User: "BalazsBaranyRM"
    New Altair Community Member
    Accepted Answer
    Hi @jsdrew,

    A basic principle of predictive modeling is that you shouldn't use a model that was built on a record to predict the outcome of that same record. Doing so would favor overfitted models.

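    Therefore, AutoModel does a "split validation": it uses about 2/3 of the data to build the model and the rest to evaluate the model by comparing the known label to the predicted one. The exported predictions cover only that held-out part, which is why you see fewer rows than your dataset contains.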
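
As a rough illustration of why a split validation leaves most rows without a prediction, here is a minimal Python sketch. It is not the process AutoModel generates; the function name `holdout_split`, the fixed seed, and the exact 2/3 fraction are assumptions made for the example.

```python
import random


def holdout_split(n_rows, train_fraction=2 / 3, seed=0):
    """Return (train_idx, test_idx) for a simple random holdout split.

    Hypothetical helper for illustration only; not a RapidMiner API.
    """
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    n_train = round(n_rows * train_fraction)
    # Rows before the cut build the model; rows after it get predictions.
    return indices[:n_train], indices[n_train:]


train_idx, test_idx = holdout_split(29_000)
print("rows used to build the model:", len(train_idx))
print("rows that receive a prediction:", len(test_idx))
```

Only the held-out rows appear in the results. With a 2/3 training share that is roughly 9,700 of 29,000 rows; the roughly 11,000 rows reported in the question would correspond to a somewhat larger held-out share.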

    If you take the process created by AutoModel and replace the split validation with a cross validation, the process will take longer (which is why AutoModel doesn't use it), because it builds 10 or 11 models instead of one. However, every row then lands in a test fold exactly once, so you will get a prediction for every row in your data.
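
The idea behind cross validation can be sketched in plain Python as follows. This is an illustration under stated assumptions, not RapidMiner's implementation: `k_fold_indices` and `out_of_fold_predictions` are hypothetical helpers, the folds are assigned by simple striding rather than random sampling, and a majority-class "model" stands in for whatever learner AutoModel picked.

```python
from collections import Counter


def k_fold_indices(n_rows, k=10):
    """Yield (train_idx, test_idx) pairs; each row is in exactly one test fold."""
    indices = list(range(n_rows))
    for fold in range(k):
        test_idx = indices[fold::k]  # every k-th row, offset by the fold number
        test_set = set(test_idx)
        train_idx = [i for i in indices if i not in test_set]
        yield train_idx, test_idx


def out_of_fold_predictions(labels, k=10):
    """Predict every row with a model that never saw that row during training."""
    predictions = [None] * len(labels)
    for train_idx, test_idx in k_fold_indices(len(labels), k):
        # Stand-in model (assumption): always predict the majority training label.
        majority = Counter(labels[i] for i in train_idx).most_common(1)[0][0]
        for i in test_idx:
            predictions[i] = majority
    return predictions


labels = ["yes" if i % 3 == 0 else "no" for i in range(29_000)]
predictions = out_of_fold_predictions(labels, k=10)
print("rows with a prediction:", sum(p is not None for p in predictions))
```

Each of the k iterations trains on the other folds and predicts only its own fold, so the cost is k model builds, but no row is left without a prediction.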

    The Academy has videos for these topics if you need more information.

    Regards,
    Balázs