"Some help when training a regression algorithm"

New Altair Community Member

Jun 5, 2012

Updated Nov 5, 2024 by Jocelyn

Hi dear Rapid-I community,

I am trying out RapidMiner modeling to build a content-based recommender system. To do that, I downloaded the MovieLens 100K dataset, which contains information about movies and the ratings users gave to them. The ratings range between 0 and 5, and the movies have genre information (action, comedy, etc.). I am training a classifier on the user who has the most ratings (uid = 405; number of reviews = 737). To do that, I discretized the rating (good >= 3.5; bad < 3.5), but because this user has many more reviews labeled bad, the classifier (libSVM) predicts every label as bad.
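To make the imbalance concrete, here is a minimal Python sketch of the discretization step (good >= 3.5; bad < 3.5) followed by a count of the resulting labels. The ratings list is invented for illustration and is not the real data for uid 405; the helper name `discretize` is also hypothetical. It shows why a classifier that always answers "bad" can still look accurate when one class dominates.

```python
from collections import Counter


def discretize(rating, threshold=3.5):
    """Map a numeric rating to the 'good'/'bad' label described in the post."""
    return "good" if rating >= threshold else "bad"


# Hypothetical ratings for a single user (NOT the real uid 405 data).
ratings = [1, 2, 2, 3, 3, 3, 4, 5, 2, 1]
labels = [discretize(r) for r in ratings]
counts = Counter(labels)

# Accuracy obtained by always predicting the most frequent class.
majority_label, majority_count = counts.most_common(1)[0]
baseline_accuracy = majority_count / len(labels)

print(counts)
print(majority_label, baseline_accuracy)
```

If the baseline accuracy is already high, overall accuracy alone says little about the model, so the per-class precision and recall are the numbers worth watching.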

Then I tried another strategy: stratified sampling (http://rapid-i.com/rapidforum/index.php/topic,2190.0.html) to balance the class labels. I got the following results:

true bad | true good | class precision
pre.bad
pre.good
class recall
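The balancing step and the class precision / class recall figures in a table like the one above can be sketched in plain Python. This is only an illustration under assumptions: it balances by down-sampling the larger class (one common way to even out labels, not necessarily what the RapidMiner operator does), and every example, label, and prediction below is made up. The function names `downsample_balanced` and `precision_recall` are hypothetical.

```python
import random
from collections import Counter, defaultdict


def downsample_balanced(examples, label_of, seed=0):
    """Keep the same number of examples per class by down-sampling larger classes."""
    by_label = defaultdict(list)
    for example in examples:
        by_label[label_of(example)].append(example)
    smallest = min(len(group) for group in by_label.values())
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    balanced = []
    for group in by_label.values():
        balanced.extend(rng.sample(group, smallest))
    rng.shuffle(balanced)
    return balanced


def precision_recall(true_labels, predicted_labels, target):
    """Return (class precision, class recall) for one target class."""
    pairs = list(zip(true_labels, predicted_labels))
    hits = sum(1 for t, p in pairs if t == target and p == target)
    predicted = sum(1 for _, p in pairs if p == target)
    actual = sum(1 for t, _ in pairs if t == target)
    precision = hits / predicted if predicted else 0.0
    recall = hits / actual if actual else 0.0
    return precision, recall


# Hypothetical (movie id, label) pairs: 6 bad, 2 good.
examples = [("m1", "bad"), ("m2", "bad"), ("m3", "bad"), ("m4", "bad"),
            ("m5", "bad"), ("m6", "bad"), ("m7", "good"), ("m8", "good")]
balanced = downsample_balanced(examples, label_of=lambda ex: ex[1])
balanced_counts = Counter(label for _, label in balanced)
print(balanced_counts)

# Hypothetical true labels and predictions, only to exercise the metric.
true_labels = ["bad", "bad", "good", "good"]
predicted_labels = ["bad", "good", "good", "good"]
print(precision_recall(true_labels, predicted_labels, "good"))
print(precision_recall(true_labels, predicted_labels, "bad"))
```

Note that down-sampling discards majority-class examples, so with a small number of reviews per user the balanced training set can become very small.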