Regression unable to use polynominal label (or any label)
green_duck
Hello all,
I'm new here and new to RM (which will be made obvious shortly). I'm trying to do a simple regression analysis based on an attribute (label) with the values (-1, 0, 1). I've followed the steps provided to me, but every time I add a regression operator, I get an error saying the operator cannot handle polynominal or numerical labels. I'm stumped.
Any help would be greatly appreciated! Thanks!
All comments
lionelderkrikor
Hi @green_duck,
If your label attribute takes the values (-1, 0, 1), this is a classification problem, not a regression problem.
A regression problem is characterized by a continuous label attribute.
Can you provide your process and your data so that we can fix your error?
Regards,
Lionel
green_duck
Hi Lionel,
Thanks for getting back to me. I had a feeling this might be the case, as I was also attempting to use cross-validation but couldn't get that operator to work either. I've attached the data (I should have done this earlier).
XML below:
<?xml version="1.0" encoding="UTF-8"?><process version="9.6.000">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="9.6.000" expanded="true" name="Process">
<parameter key="logverbosity" value="init"/>
<parameter key="random_seed" value="2001"/>
<parameter key="send_mail" value="never"/>
<parameter key="notification_email" value=""/>
<parameter key="process_duration_for_mail" value="30"/>
<parameter key="encoding" value="SYSTEM"/>
<process expanded="true">
<operator activated="true" breakpoints="after" class="retrieve" compatibility="9.6.000" expanded="true" height="68" name="Retrieve Tweets_sequence" width="90" x="45" y="34">
<parameter key="repository_entry" value="data/Tweets_sequence"/>
</operator>
<operator activated="true" class="subprocess" compatibility="9.6.000" expanded="true" height="103" name="Subprocess" width="90" x="179" y="85">
<process expanded="true">
<operator activated="true" class="select_attributes" compatibility="9.6.000" expanded="true" height="82" name="Select Attributes" width="90" x="179" y="34">
<parameter key="attribute_filter_type" value="all"/>
<parameter key="attribute" value="sentiment"/>
<parameter key="attributes" value="1_word|2_word|3_word|4_word|5_word|6_word|7_word|8_word|9_word|10_word|11_word|12_word|13_word|14_word|15_word|16_word|17_word|18_word|19_word|20_word|21_word|22_word|23_word|24_word|25_word|26_word|27_word|28_word|29_word|30_word"/>
<parameter key="use_except_expression" value="false"/>
<parameter key="value_type" value="attribute_value"/>
<parameter key="use_value_type_exception" value="false"/>
<parameter key="except_value_type" value="time"/>
<parameter key="block_type" value="attribute_block"/>
<parameter key="use_block_type_exception" value="false"/>
<parameter key="except_block_type" value="value_matrix_row_start"/>
<parameter key="invert_selection" value="false"/>
<parameter key="include_special_attributes" value="true"/>
</operator>
<operator activated="true" class="numerical_to_polynominal" compatibility="9.6.000" expanded="true" height="82" name="Numerical to Polynominal" width="90" x="313" y="34">
<parameter key="attribute_filter_type" value="single"/>
<parameter key="attribute" value="sentiment"/>
<parameter key="attributes" value=""/>
<parameter key="use_except_expression" value="false"/>
<parameter key="value_type" value="numeric"/>
<parameter key="use_value_type_exception" value="false"/>
<parameter key="except_value_type" value="real"/>
<parameter key="block_type" value="value_series"/>
<parameter key="use_block_type_exception" value="false"/>
<parameter key="except_block_type" value="value_series_end"/>
<parameter key="invert_selection" value="false"/>
<parameter key="include_special_attributes" value="false"/>
</operator>
<operator activated="true" class="set_role" compatibility="9.6.000" expanded="true" height="82" name="Set Role" width="90" x="447" y="34">
<parameter key="attribute_name" value="sentiment"/>
<parameter key="target_role" value="label"/>
<list key="set_additional_roles"/>
</operator>
<operator activated="true" class="split_data" compatibility="9.6.000" expanded="true" height="103" name="Split Data" width="90" x="849" y="34">
<enumeration key="partitions">
<parameter key="ratio" value="0.8"/>
<parameter key="ratio" value="0.2"/>
</enumeration>
<parameter key="sampling_type" value="shuffled sampling"/>
<parameter key="use_local_random_seed" value="false"/>
<parameter key="local_random_seed" value="1992"/>
</operator>
<connect from_port="in 1" to_op="Select Attributes" to_port="example set input"/>
<connect from_op="Select Attributes" from_port="example set output" to_op="Numerical to Polynominal" to_port="example set input"/>
<connect from_op="Numerical to Polynominal" from_port="example set output" to_op="Set Role" to_port="example set input"/>
<connect from_op="Set Role" from_port="example set output" to_op="Split Data" to_port="example set"/>
<connect from_op="Split Data" from_port="partition 1" to_port="out 1"/>
<connect from_op="Split Data" from_port="partition 2" to_port="out 2"/>
<portSpacing port="source_in 1" spacing="0"/>
<portSpacing port="source_in 2" spacing="0"/>
<portSpacing port="sink_out 1" spacing="0"/>
<portSpacing port="sink_out 2" spacing="0"/>
<portSpacing port="sink_out 3" spacing="0"/>
</process>
</operator>
<operator activated="true" class="linear_regression" compatibility="9.6.000" expanded="true" height="103" name="Linear Regression" width="90" x="313" y="85">
<parameter key="feature_selection" value="M5 prime"/>
<parameter key="alpha" value="0.05"/>
<parameter key="max_iterations" value="10"/>
<parameter key="forward_alpha" value="0.05"/>
<parameter key="backward_alpha" value="0.05"/>
<parameter key="eliminate_colinear_features" value="true"/>
<parameter key="min_tolerance" value="0.05"/>
<parameter key="use_bias" value="true"/>
<parameter key="ridge" value="1.0E-8"/>
</operator>
<operator activated="true" class="apply_model" compatibility="9.6.000" expanded="true" height="82" name="Apply Model" width="90" x="581" y="238">
<list key="application_parameters"/>
<parameter key="create_view" value="false"/>
</operator>
<operator activated="true" class="performance" compatibility="9.6.000" expanded="true" height="82" name="Performance" width="90" x="782" y="136">
<parameter key="use_example_weights" value="true"/>
</operator>
<connect from_op="Retrieve Tweets_sequence" from_port="output" to_op="Subprocess" to_port="in 1"/>
<connect from_op="Subprocess" from_port="out 2" to_op="Linear Regression" to_port="training set"/>
<connect from_op="Linear Regression" from_port="exampleSet" to_op="Apply Model" to_port="unlabelled data"/>
<connect from_op="Apply Model" from_port="labelled data" to_op="Performance" to_port="labelled data"/>
<connect from_op="Performance" from_port="performance" to_port="result 1"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
</process>
</operator>
</process>
Tweets_sequence.csv
lionelderkrikor
@green_duck,
The working process is in the attached file.
As said previously, you have a classification problem, so you need a classifier model (here I used a Naive Bayes model).
The Linear Regression model you used is dedicated to regression tasks and thus raised an error in your case.
To go further and find the best model for your use case, I advise you to use the Auto-Model tool: click on Auto-Model, submit your data, choose Predict, select your label attribute (in your case "sentiment"), and then follow the indications.
Good luck!
Hope this helps,
Regards,
Lionel
Sentiment_analysis_tweets.rmp
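For readers without access to the attachment, the change described above can be sketched as a fragment of the top-level process from the question. This is a hedged reconstruction, not the contents of Sentiment_analysis_tweets.rmp. The operator class key naive_bayes, its laplace_correction parameter, and the port names "training set" and "model" are assumed from RapidMiner 9.x conventions and should be checked in Studio. The fragment swaps Linear Regression for Naive Bayes, trains on the first partition coming out of the Subprocess, and applies the model to the second. In the posted process, Apply Model received no model at all, so the fragment also wires the model port.

```xml
<!-- Sketch only: replaces the Linear Regression, Apply Model and Performance
     operators and the top-level connect lines of the process in the question.
     The class key "naive_bayes" and the port names are assumptions to verify. -->
<operator activated="true" class="naive_bayes" compatibility="9.6.000" expanded="true" height="82" name="Naive Bayes" width="90" x="313" y="85">
  <parameter key="laplace_correction" value="true"/>
</operator>
<operator activated="true" class="apply_model" compatibility="9.6.000" expanded="true" height="82" name="Apply Model" width="90" x="581" y="238">
  <list key="application_parameters"/>
  <parameter key="create_view" value="false"/>
</operator>
<operator activated="true" class="performance" compatibility="9.6.000" expanded="true" height="82" name="Performance" width="90" x="782" y="136">
  <parameter key="use_example_weights" value="true"/>
</operator>
<connect from_op="Retrieve Tweets_sequence" from_port="output" to_op="Subprocess" to_port="in 1"/>
<!-- 80% partition trains the classifier -->
<connect from_op="Subprocess" from_port="out 1" to_op="Naive Bayes" to_port="training set"/>
<!-- 20% partition is scored by the trained model -->
<connect from_op="Subprocess" from_port="out 2" to_op="Apply Model" to_port="unlabelled data"/>
<connect from_op="Naive Bayes" from_port="model" to_op="Apply Model" to_port="model"/>
<connect from_op="Apply Model" from_port="labelled data" to_op="Performance" to_port="labelled data"/>
<connect from_op="Performance" from_port="performance" to_port="result 1"/>
```

The Subprocess from the question is left unchanged, since it already converts "sentiment" to polynominal and sets it as the label. The generic Performance operator is kept from the original on the assumption that it picks classification criteria for a polynominal label.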
green_duck
@lionelderkrikor
Thank you so much! This was very helpful. Just one last question: are there any deep learning models (NNs) that you would suggest for this same dataset?
lionelderkrikor
@green_duck,
I'm not a deep learning specialist, but as said previously, you can start with the Deep Learning model proposed in Auto-Model with the default parameters.
Don't forget to enable the option Turn into Classification.
You will obtain a first performance figure. Then you can experiment with the structure of the neural network by adding or removing hidden layers, increasing the number of epochs, changing the activation function, etc., and see whether you can improve the performance of your process.
Hope this helps,
Regards,
Lionel