Predict missing values
Hello all,
I have a dataset with about 3000 records of rated songs. About half are rated; the other half are not. I'm trying to build a model that predicts the missing ratings based on what users have rated. I have done the following:
My question is: is this correct? Do I need to make adjustments to improve it? For example, when I change k, I get different values. Another question: how do I show only the values that have been predicted, instead of a full overview that includes the ratings that were already filled in?
Thanks in advance!


@hatsjikidee,
OK, I understand. In theory, your method is sound. But as you mentioned, each value of k gives different results, and you currently have no way to evaluate the "performance" of each prediction.
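One common way to compare k values is to hide ratings you already know, predict them, and measure the error. Below is a minimal sketch in plain Python (not RapidMiner), assuming each rated song has a few numeric features; the feature names, the sample rows, and the helper functions `knn_predict` and `leave_one_out_mae` are all hypothetical and only illustrate the idea.

```python
import math

def knn_predict(train, query, k):
    """Average the ratings of the k training rows closest to `query`."""
    nearest = sorted(train, key=lambda row: math.dist(row[0], query))[:k]
    return sum(rating for _, rating in nearest) / len(nearest)

def leave_one_out_mae(rated, k):
    """Hide each known rating in turn, predict it, and average the absolute errors."""
    errors = []
    for i, (features, rating) in enumerate(rated):
        rest = rated[:i] + rated[i + 1:]  # every row except the hidden one
        errors.append(abs(knn_predict(rest, features, k) - rating))
    return sum(errors) / len(errors)

# Hypothetical rated rows: ((tempo_scaled, energy_scaled), rating)
rated = [
    ((0.10, 0.20), 1.0), ((0.20, 0.10), 2.0), ((0.15, 0.25), 1.0),
    ((0.80, 0.90), 5.0), ((0.90, 0.80), 4.0), ((0.85, 0.95), 5.0),
]

# Mean absolute error for each candidate k; lower is better.
mae_by_k = {k: leave_one_out_mae(rated, k) for k in (1, 2, 3, 5)}
best_k = min(mae_by_k, key=mae_by_k.get)

for k, mae in sorted(mae_by_k.items()):
    print(f"k={k}: MAE={mae:.3f}")
print("best k:", best_k)
```

With a real dataset you would pick the k with the lowest error on the held-out ratings, rather than judging by how the predictions look.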
From my point of view, to create a real recommender model you need descriptive features of your songs. For example, you need an associated dataset that gives, for each song, its style (pop, rock, etc.), its length, its author, etc.
Hope this helps,
Regards,
Lionel
PS: There is a useful resource (a book) for you:
- "RapidMiner: Data Mining Use Cases and Business Analytics Applications" (Chapter 9: Constructing Recommender Systems in RapidMiner), by Markus Hofmann and Ralf Klinkenberg.
- the associated "Recommenders" extension (to install from the Marketplace).
@hatsjikidee, as @lionelderkrikor said, you need to add more data to what you already have; then you can predict the rating the user is going to give. Also read up on how the Netflix recommendation algorithm works. You could also add some depth to the analysis by obtaining the lyrics and doing some text mining to find the most frequent words in each song and whether their presence affects the user's rating.
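As a rough illustration of the text mining idea, here is a small Python sketch that counts the most frequent words in a lyric after dropping a few common words. The lyric text is made-up placeholder text, and the `STOPWORDS` set and `top_words` helper are hypothetical; a real analysis would use a proper stopword list and real lyrics.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, far from complete).
STOPWORDS = {"a", "all", "and", "the", "until", "we"}

def top_words(text, n=3):
    """Return the n most frequent non-stopword words in `text`."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(w for w in words if w not in STOPWORDS).most_common(n)

# Placeholder lyric, not taken from any real song.
lyric = "Dance dance all night, we dance until the morning light"
print(top_words(lyric, n=1))
```

The resulting word counts could then be added as extra attributes per song, next to style, length, and so on.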
I don't know if you are doing this as part of a class or just for fun, but in real life part of being a Data Scientist is analysing the problem, identifying the data that may or may not predict an outcome, and then extracting it from its source. Sometimes the Example Set includes all the attributes you need, and sometimes you need to go out to the internet and find more data to enhance your analysis.
Hope this helps. If you need help, message us and we'll be glad to guide you through the process.
Best Regards.
If you have some descriptive features of your songs, you can build a model based on your labeled data (your rated songs) and then apply this model to the unlabelled data (the unrated songs).
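To make that concrete, here is a minimal sketch in plain Python (not RapidMiner) of training on the rated songs, applying the model to the unrated ones, and listing only the newly predicted ratings. The song names, the two numeric features, and the `knn_predict` helper are hypothetical; k-NN is used here only because it was the method mentioned in the question.

```python
import math

def knn_predict(train, query, k=3):
    """Average the ratings of the k rated songs closest to `query`."""
    nearest = sorted(train, key=lambda s: math.dist(s["features"], query))[:k]
    return sum(s["rating"] for s in nearest) / len(nearest)

# Hypothetical data: rating is None where the song has not been rated yet.
songs = [
    {"song": "A", "features": (0.10, 0.20), "rating": 1.0},
    {"song": "B", "features": (0.20, 0.10), "rating": 2.0},
    {"song": "C", "features": (0.80, 0.90), "rating": 5.0},
    {"song": "D", "features": (0.90, 0.80), "rating": 4.0},
    {"song": "E", "features": (0.12, 0.18), "rating": None},
    {"song": "F", "features": (0.85, 0.85), "rating": None},
]

# Split into labeled (rated) and unlabelled (unrated) examples.
rated = [s for s in songs if s["rating"] is not None]
unrated = [s for s in songs if s["rating"] is None]

# Predict only for the unrated songs, so the output contains no known ratings.
predictions = [
    {"song": s["song"], "predicted_rating": knn_predict(rated, s["features"], k=2)}
    for s in unrated
]

for p in predictions:
    print(p["song"], p["predicted_rating"])
```

Because the predictions are built from the unrated rows only, the output lists just the predicted values and leaves out the ratings that were already filled in.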
To help you further, can you share your data?
Hope this helps,
Regards,
Lionel