Predict missing values

hatsjikidee New Altair Community Member
edited November 5 in Community Q&A
Hello all,

I have a dataset with about 3000 records of rated songs. About half are rated, the other half is not. I'm trying to build a model that predicts the empty ratings based on what users rated. I have done the following:

My question is: is this correct? Do I need to adjust anything to improve it? When I change k, for example, I get different predicted values. A second question: how do I show only the values that have been predicted, instead of the full overview that also includes the ratings that were already filled in?

Thanks in advance!

Answers

  • lionelderkrikor New Altair Community Member
    Hi @hatsjikidee,

    If you have some descriptive features of your songs, you can build a model based on your labeled data (your rated songs) and then apply this model to the unlabelled data (the unrated songs).

    To help you further, can you share your data?

    Hope this helps,

    Regards,

    Lionel

  • hatsjikidee New Altair Community Member
    Hi lionel,

    The dataset has 3 attributes:
    Song name - Rating - Name (of the rater)

    Every user has about 40 songs, of which about 20 are rated and 20 are not. So the goal is to predict each user's missing ratings based on the ratings that user did give. I hope this clarifies things.
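
The setup described above (song, rating, rater, with half the ratings empty) can be sketched outside RapidMiner as a small user-based k-nearest-neighbour predictor. This is only an illustration under assumptions: the toy data, the function names, and the choice of cosine similarity are hypothetical and are not taken from the process in the original post. It returns only the rows that were empty, which is one way to see just the predicted values.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical toy data in the thread's layout: (song, rating, rater).
# None marks a rating that still has to be predicted.
rows = [
    ("Song A", 5, "ann"), ("Song B", 4, "ann"), ("Song C", None, "ann"),
    ("Song A", 5, "bob"), ("Song B", 4, "bob"), ("Song C", 2, "bob"),
    ("Song A", 1, "cat"), ("Song B", 2, "cat"), ("Song C", 5, "cat"),
]

def build_profiles(rows):
    """Map each rater to a {song: rating} dict of the known ratings."""
    profiles = defaultdict(dict)
    for song, rating, rater in rows:
        if rating is not None:
            profiles[rater][song] = rating
    return profiles

def similarity(a, b):
    """Cosine similarity over the songs both raters have rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[s] * b[s] for s in shared)
    norm_a = sqrt(sum(a[s] ** 2 for s in shared))
    norm_b = sqrt(sum(b[s] ** 2 for s in shared))
    return dot / (norm_a * norm_b)

def predict_missing(rows, k=2):
    """Return only the rows that were empty, each with a predicted rating."""
    profiles = build_profiles(rows)
    predicted = []
    for song, rating, rater in rows:
        if rating is not None:
            continue  # skip ratings that were already filled in
        mine = profiles.get(rater, {})
        # Other raters who rated this song, most similar first, cut to k.
        neighbours = sorted(
            ((similarity(mine, p), p[song])
             for other, p in profiles.items()
             if other != rater and song in p),
            reverse=True,
        )[:k]
        total = sum(sim for sim, _ in neighbours)
        if total > 0:
            value = sum(sim * r for sim, r in neighbours) / total
            predicted.append((song, round(value, 2), rater))
    return predicted

for k in (1, 2):
    print(k, predict_missing(rows, k=k))
```

Even on this tiny example the prediction for the empty rating changes with k, because a larger k averages in less similar raters.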
  • lionelderkrikor New Altair Community Member
    Answer ✓
    @hatsjikidee,

    OK, I understand. In theory, your method is the right one, but as you mentioned, each value of k gives different results, and you cannot evaluate the "performance" of each prediction.

    From my point of view, to create a real recommender model, you need descriptive features of your songs. For example, you need an associated dataset that gives, for each song, its style (pop, rock, etc.), its length, its author, etc.

    Hope this helps,

    Regards,

    Lionel

    PS: Here is a useful resource (a book) for you:
     - "RapidMiner: Data Mining Use Cases and Business Analytics Applications" (Chapter 9: Constructing Recommender Systems in RapidMiner), by Markus Hofmann and Ralf Klinkenberg.
     - the associated "Recommenders" extension (to install from the Marketplace).
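
On the point about k: the unrated songs have no true value to compare against, so the predictions themselves cannot be scored. One common workaround, which this thread does not use and which is offered here only as a sketch, is to hide each already known rating in turn, predict it, and compare the mean absolute error across values of k. The data, the names, and the cosine-similarity predictor below are all hypothetical.

```python
from math import sqrt

# Hypothetical known ratings: {rater: {song: rating}}.
known = {
    "ann": {"Song A": 5, "Song B": 4, "Song C": 2},
    "bob": {"Song A": 5, "Song B": 4, "Song C": 1},
    "cat": {"Song A": 1, "Song B": 2, "Song C": 5},
    "dan": {"Song A": 2, "Song B": 1, "Song C": 4},
}

def cosine(a, b):
    """Cosine similarity over the songs both raters have rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[s] * b[s] for s in shared)
    return dot / (sqrt(sum(a[s] ** 2 for s in shared)) *
                  sqrt(sum(b[s] ** 2 for s in shared)))

def predict(profiles, rater, song, k):
    """Similarity-weighted mean of the k nearest raters' ratings for song."""
    mine = profiles[rater]
    neighbours = sorted(
        ((cosine(mine, p), p[song]) for other, p in profiles.items()
         if other != rater and song in p),
        reverse=True,
    )[:k]
    total = sum(sim for sim, _ in neighbours)
    if total == 0:
        return None  # no usable neighbour for this song
    return sum(sim * r for sim, r in neighbours) / total

def leave_one_out_mae(known, k):
    """Hide each known rating in turn, predict it, and average the error."""
    errors = []
    for rater, ratings in known.items():
        for song, actual in ratings.items():
            hidden = {r: dict(p) for r, p in known.items()}
            del hidden[rater][song]  # pretend this rating is missing
            guess = predict(hidden, rater, song, k)
            if guess is not None:
                errors.append(abs(guess - actual))
    return sum(errors) / len(errors)

for k in (1, 2, 3):
    print(k, round(leave_one_out_mae(known, k), 2))
```

The k with the lowest error on the hidden ratings is a reasonable candidate, with the caveat that about 20 ratings per user is a small basis for such a comparison.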
  • Marco_Barradas Altair Employee
    @hatsjikidee, as @lionelderkrikor said, you need to add more data to what you already have, and then you can predict the rating that the user is going to give. Also read up on how the Netflix recommendation algorithm works. You could add some depth to the analysis by obtaining the lyrics and doing some text mining to find the words that are repeated most in the songs, and checking whether their presence affects the user's rating.
    I don't know if you are doing this as part of a class or just for fun, but in real life part of being a Data Scientist is analysing the problem, identifying the data that may or may not predict an outcome, and then extracting it from its source. Sometimes the Example Set includes all the attributes you need, and sometimes you need to go out to the internet and find more data to enhance your analysis.
    Hope this helps. If you need more help, message us and we'll be glad to guide you through the process.

    Best Regards.
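
The lyrics idea above can be sketched as a simple word count per song. This is a minimal illustration, assuming the lyrics are already available as plain text; the snippets, the stop-word list, and the function name are hypothetical.

```python
import re
from collections import Counter

# Hypothetical lyrics snippets keyed by song name.
lyrics = {
    "Song A": "love love me do you know I love you",
    "Song B": "rain on the roof rain in my heart",
}

# Tiny illustrative stop-word list; a real one would be much longer.
STOP_WORDS = {"the", "in", "on", "my", "i", "you", "me", "do"}

def top_words(text, n=3):
    """Return the n most frequent non-stop-words in text with their counts."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(w for w in words if w not in STOP_WORDS).most_common(n)

features = {song: top_words(text) for song, text in lyrics.items()}
for song, words in features.items():
    print(song, words)
```

Counts like these could then be joined to the rating table by song name as extra descriptive attributes.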
  • hatsjikidee New Altair Community Member
    So from what I understand, my predictions are about as good as this process allows. Thank you both for the help and information!