Recommender System on RapidMiner
hattan
New Altair Community Member
Hi
I'm trying to build a recommendation system that recommends goals to users based on their goal lists.
It's my senior project. I'm working on it on my own and running out of time,
so please, I need any help or advice you can give me.
My dataset (125 users, 7 categories, 1800 goal names) is an Access database.
I clustered each goal category separately and got 5 clusters per category:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.1.014">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="5.1.014" expanded="true" name="Process">
<process expanded="true" height="431" width="681">
<operator activated="true" class="retrieve" compatibility="5.1.014" expanded="true" height="60" name="Retrieve" width="90" x="6" y="38">
<parameter key="repository_entry" value="Travel &amp; Entertainment"/>
</operator>
<operator activated="true" class="select_attributes" compatibility="5.1.014" expanded="true" height="76" name="Select Attributes" width="90" x="45" y="120">
<parameter key="attribute_filter_type" value="subset"/>
<parameter key="attribute" value="Goal Name"/>
<parameter key="attributes" value="Goal Name|Goal-ID|User_ID"/>
</operator>
<operator activated="true" class="nominal_to_text" compatibility="5.1.014" expanded="true" height="76" name="Nominal to Text" width="90" x="45" y="210">
<parameter key="attribute_filter_type" value="single"/>
<parameter key="attribute" value="Goal Name"/>
</operator>
<operator activated="true" class="text:process_document_from_data" compatibility="5.1.004" expanded="true" height="76" name="Process Documents from Data" width="90" x="45" y="300">
<parameter key="vector_creation" value="Binary Term Occurrences"/>
<list key="specify_weights"/>
<process expanded="true" height="370" width="563">
<operator activated="true" class="text:tokenize" compatibility="5.1.004" expanded="true" height="60" name="Tokenize" width="90" x="45" y="30"/>
<operator activated="true" class="text:transform_cases" compatibility="5.1.004" expanded="true" height="60" name="Transform Cases" width="90" x="45" y="120"/>
<operator activated="true" class="text:filter_stopwords_english" compatibility="5.1.004" expanded="true" height="60" name="Filter Stopwords (English)" width="90" x="315" y="30"/>
<operator activated="true" class="text:stem_porter" compatibility="5.1.004" expanded="true" height="60" name="Stem (Porter)" width="90" x="448" y="30"/>
<connect from_port="document" to_op="Tokenize" to_port="document"/>
<connect from_op="Tokenize" from_port="document" to_op="Transform Cases" to_port="document"/>
<connect from_op="Transform Cases" from_port="document" to_op="Filter Stopwords (English)" to_port="document"/>
<connect from_op="Filter Stopwords (English)" from_port="document" to_op="Stem (Porter)" to_port="document"/>
<connect from_op="Stem (Porter)" from_port="document" to_port="document 1"/>
<portSpacing port="source_document" spacing="0"/>
<portSpacing port="sink_document 1" spacing="0"/>
<portSpacing port="sink_document 2" spacing="0"/>
</process>
</operator>
<operator activated="true" class="k_means" compatibility="5.1.014" expanded="true" height="76" name="Clustering" width="90" x="179" y="30">
<parameter key="k" value="5"/>
</operator>
<operator activated="true" class="select_attributes" compatibility="5.1.014" expanded="true" height="76" name="Select Attributes (2)" width="90" x="313" y="255">
<parameter key="attribute_filter_type" value="subset"/>
<parameter key="attributes" value="|Goal-ID|User_ID|cluster"/>
</operator>
<connect from_op="Retrieve" from_port="output" to_op="Select Attributes" to_port="example set input"/>
<connect from_op="Select Attributes" from_port="example set output" to_op="Nominal to Text" to_port="example set input"/>
<connect from_op="Nominal to Text" from_port="example set output" to_op="Process Documents from Data" to_port="example set"/>
<connect from_op="Process Documents from Data" from_port="example set" to_op="Clustering" to_port="example set"/>
<connect from_op="Clustering" from_port="clustered set" to_op="Select Attributes (2)" to_port="example set input"/>
<connect from_op="Select Attributes (2)" from_port="example set output" to_port="result 1"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
</process>
</operator>
</process>
I did a lot of manual work: I renamed the clusters, made a crosstab query of cluster names against user ID, and made other modifications to get binominal values (f means the user has no goal in that cluster, t means they do).
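For illustration only, the crosstab step could be sketched outside RapidMiner roughly like this in Python. The sample rows, the cluster labels and the function name `user_cluster_flags` are made up for the example; only the t/f encoding comes from the description above.

```python
from collections import defaultdict


def user_cluster_flags(rows, true_value="t", false_value="f"):
    """Pivot (user_id, cluster) pairs into one t/f row per user."""
    # Every cluster label that occurs anywhere becomes a column.
    clusters = sorted({cluster for _, cluster in rows})
    # Collect the set of clusters in which each user has at least one goal.
    seen = defaultdict(set)
    for user_id, cluster in rows:
        seen[user_id].add(cluster)
    return {
        user_id: {c: (true_value if c in owned else false_value) for c in clusters}
        for user_id, owned in seen.items()
    }


# Hypothetical sample: each pair is one goal of a user and the cluster it fell into.
rows = [
    (1, "travel_cluster_0"),
    (1, "travel_cluster_2"),
    (2, "travel_cluster_2"),
]
table = user_cluster_flags(rows)
for user_id in sorted(table):
    print(user_id, table[user_id])
```

Each user ends up with one row and one t/f column per cluster, which is the shape the frequent item set step expects.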
Then, in another process, I apply association rules to the frequent item sets, as shown here:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.1.014">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="5.1.014" expanded="true" name="Process">
<process expanded="true" height="390" width="614">
<operator activated="true" class="retrieve" compatibility="5.1.014" expanded="true" height="60" name="Retrieve" width="90" x="45" y="75">
<parameter key="repository_entry" value="read User_Cluster"/>
</operator>
<operator activated="true" class="select_attributes" compatibility="5.1.014" expanded="true" height="76" name="Select Attributes" width="90" x="179" y="75">
<parameter key="attribute_filter_type" value="single"/>
<parameter key="attribute" value="User-ID"/>
<parameter key="invert_selection" value="true"/>
</operator>
<operator activated="true" class="fp_growth" compatibility="5.1.014" expanded="true" height="76" name="FP-Growth" width="90" x="313" y="75">
<parameter key="min_support" value="0.5"/>
</operator>
<operator activated="true" class="create_association_rules" compatibility="5.1.014" expanded="true" height="76" name="Create Association Rules" width="90" x="447" y="75">
<parameter key="min_confidence" value="0.5"/>
</operator>
<connect from_op="Retrieve" from_port="output" to_op="Select Attributes" to_port="example set input"/>
<connect from_op="Select Attributes" from_port="example set output" to_op="FP-Growth" to_port="example set"/>
<connect from_op="FP-Growth" from_port="frequent sets" to_op="Create Association Rules" to_port="item sets"/>
<connect from_op="Create Association Rules" from_port="rules" to_port="result 1"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
</process>
</operator>
</process>
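The two thresholds in this process, min_support and min_confidence, refer to the standard support and confidence measures for association rules. A minimal Python sketch of those definitions, using made-up baskets and hypothetical helper names (it only shows what the numbers mean, not how FP-Growth computes them internally):

```python
def support(baskets, items):
    """Fraction of baskets that contain every item in `items`."""
    items = set(items)
    return sum(items <= basket for basket in baskets) / len(baskets)


def confidence(baskets, premise, conclusion):
    """Support of premise and conclusion together, divided by support of the premise."""
    premise_support = support(baskets, premise)
    if premise_support == 0:
        return 0.0  # the premise never occurs, so the rule cannot be evaluated
    return support(baskets, set(premise) | set(conclusion)) / premise_support


# Hypothetical data: one set per user, holding the clusters flagged "t" for that user.
baskets = [
    {"A", "B"},
    {"A", "B"},
    {"A"},
    {"C"},
]
sup_ab = support(baskets, {"A", "B"})         # 2 of the 4 users have both A and B
conf_a_b = confidence(baskets, {"A"}, {"B"})  # 2 of the 3 users with A also have B
print(sup_ab, conf_a_b)
```

With both thresholds at 0.5, the rule A -> B in this toy data would just pass on support and pass on confidence.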
I know how rough this is; I didn't even get a definite final recommended goal!
My interface will be a web page.
How can I fix my process?
How can I run the association analysis directly on the output of the clustering operator, without all of my manual work?
Each category has its own process; how can I combine them?
What should the process be when a new user enters a new goal?
What should it go through?
Is there a better way to build my system?
Any help will be very much appreciated.
Regards