Naive Bayes Classification of multiple rows
iinnaanncc
New Altair Community Member
Hello everyone,
I am building a Naive Bayes classification process in RapidMiner. My training data, which I use to construct the model, has several thousand rows in the following format:
label attribute attribute attribute attribute attribute attribute
I then want to classify another data set, which has 3 rows in the following format:
attribute attribute attribute attribute attribute attribute
In this case everything runs normally, and I get one prediction per row from the Naive Bayes model (3 predictions in total).
My question is this: what if I assume that these 3 rows belong to the same category and therefore want only 1 prediction in total, based on all three rows? How can I do that? Please help me.
I hope I have explained myself clearly.
Thanks in advance,
iinnaanncc
Answers
Hi iinnaanncc,
You can't tell Naive Bayes to assign a single label to a group of three rows, but you can classify all three rows separately (as you are doing now) and then return the label that appears most frequently. To do that, use the Aggregate operator with the "mode" aggregation function on the prediction attribute, as in the example process below.
Best,
Marius
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.1.017">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="5.1.017" expanded="true" name="Process">
    <process expanded="true" height="633" width="743">
      <operator activated="true" class="generate_data" compatibility="5.1.017" expanded="true" height="60" name="Generate Data" width="90" x="112" y="30">
        <parameter key="target_function" value="polynomial classification"/>
        <parameter key="number_examples" value="1000"/>
      </operator>
      <operator activated="true" class="naive_bayes" compatibility="5.1.017" expanded="true" height="76" name="Naive Bayes" width="90" x="246" y="30"/>
      <operator activated="true" class="generate_data" compatibility="5.1.017" expanded="true" height="60" name="Generate Data (2)" width="90" x="112" y="120">
        <parameter key="target_function" value="polynomial classification"/>
        <parameter key="number_examples" value="3"/>
      </operator>
      <operator activated="true" class="apply_model" compatibility="5.1.017" expanded="true" height="76" name="Apply Model" width="90" x="380" y="75">
        <list key="application_parameters"/>
      </operator>
      <operator activated="true" class="aggregate" compatibility="5.1.017" expanded="true" height="76" name="Aggregate" width="90" x="514" y="30">
        <list key="aggregation_attributes">
          <parameter key="prediction(label)" value="mode"/>
        </list>
      </operator>
      <connect from_op="Generate Data" from_port="output" to_op="Naive Bayes" to_port="training set"/>
      <connect from_op="Naive Bayes" from_port="model" to_op="Apply Model" to_port="model"/>
      <connect from_op="Generate Data (2)" from_port="output" to_op="Apply Model" to_port="unlabelled data"/>
      <connect from_op="Apply Model" from_port="labelled data" to_op="Aggregate" to_port="example set input"/>
      <connect from_op="Aggregate" from_port="example set output" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>
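
For readers who want to see the idea outside RapidMiner, the majority vote that the "mode" aggregation performs can be sketched in a few lines of Python. This is only an illustration, not part of the process above: the function name majority_label and the sample labels are made up for the example, and the per-row predictions are assumed to have been computed already.

```python
from collections import Counter


def majority_label(predictions):
    """Return the most frequent label among per-row predictions.

    Hypothetical helper for illustration; it mirrors taking the mode
    of the prediction column for a group of rows.
    """
    if not predictions:
        raise ValueError("need at least one prediction")
    # most_common(1) yields [(label, count)] for the top label;
    # when counts tie, the label seen first wins.
    label, _count = Counter(predictions).most_common(1)[0]
    return label


# Assumed per-row predictions for the three rows (made-up values).
row_predictions = ["positive", "negative", "positive"]
group_prediction = majority_label(row_predictions)
print(group_prediction)
```

Note that with only three rows and more than two classes, all three predictions can differ, in which case the vote is a tie and the result depends on the tie-breaking rule.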