Regression unable to use polynomial label (or any label)

green_duck
New Altair Community Member
Hello all,
I'm new here and new to RM (which will be made obvious shortly). I'm trying to do a simple regression analysis on an attribute (label) with the values (-1, 0, 1). I've followed the steps provided to me, but every time I add a regression operator, I get an error saying the operator cannot handle polynomial or numerical labels. I'm stumped.
Any help would be greatly appreciated! Thanks!
Answers
Hi @green_duck,
If your attribute (label) has (-1, 0, 1) as values, it is a classification problem, not a regression problem.
A regression problem is characterized by a continuous attribute (label).
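As a rough illustration of that distinction, here is a minimal Python sketch (plain Python, not RapidMiner code). The helper name `looks_like_classification` and the `max_classes` threshold are made up for this example, and the check is only a heuristic: a label with a handful of distinct values is treated as a set of classes, while a label with many distinct values is treated as continuous.

```python
def looks_like_classification(labels, max_classes=10):
    """Heuristic: a small number of distinct label values suggests classes,
    not a continuous regression target. The threshold is an assumption."""
    distinct = set(labels)
    return len(distinct) <= max_classes


# A sentiment label like the one in this thread: only three distinct values.
sentiment = [-1, 0, 1, 1, 0, -1, 1]

# A hypothetical continuous target with fifty distinct values.
prices = [i * 0.37 + 2.5 for i in range(50)]

print(looks_like_classification(sentiment))  # True
print(looks_like_classification(prices))     # False
```

The values -1, 0 and 1 happen to be numbers, but here they act as names for three categories, which is why a classifier is the better fit.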
Can you provide your process and your data so that we can fix your error?
Regards,
Lionel
Hi Lionel,
Thanks for getting back to me. I had a feeling this might be the case, as I was also attempting to use cross-validation but couldn't get that operator to work either. I've attached the data (should've done this earlier).
XML below:
<?xml version="1.0" encoding="UTF-8"?>
<process version="9.6.000">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="9.6.000" expanded="true" name="Process">
    <parameter key="logverbosity" value="init"/>
    <parameter key="random_seed" value="2001"/>
    <parameter key="send_mail" value="never"/>
    <parameter key="notification_email" value=""/>
    <parameter key="process_duration_for_mail" value="30"/>
    <parameter key="encoding" value="SYSTEM"/>
    <process expanded="true">
      <operator activated="true" breakpoints="after" class="retrieve" compatibility="9.6.000" expanded="true" height="68" name="Retrieve Tweets_sequence" width="90" x="45" y="34">
        <parameter key="repository_entry" value="data/Tweets_sequence"/>
      </operator>
      <operator activated="true" class="subprocess" compatibility="9.6.000" expanded="true" height="103" name="Subprocess" width="90" x="179" y="85">
        <process expanded="true">
          <operator activated="true" class="select_attributes" compatibility="9.6.000" expanded="true" height="82" name="Select Attributes" width="90" x="179" y="34">
            <parameter key="attribute_filter_type" value="all"/>
            <parameter key="attribute" value="sentiment"/>
            <parameter key="attributes" value="1_word|2_word|3_word|4_word|5_word|6_word|7_word|8_word|9_word|10_word|11_word|12_word|13_word|14_word|15_word|16_word|17_word|18_word|19_word|20_word|21_word|22_word|23_word|24_word|25_word|26_word|27_word|28_word|29_word|30_word"/>
            <parameter key="use_except_expression" value="false"/>
            <parameter key="value_type" value="attribute_value"/>
            <parameter key="use_value_type_exception" value="false"/>
            <parameter key="except_value_type" value="time"/>
            <parameter key="block_type" value="attribute_block"/>
            <parameter key="use_block_type_exception" value="false"/>
            <parameter key="except_block_type" value="value_matrix_row_start"/>
            <parameter key="invert_selection" value="false"/>
            <parameter key="include_special_attributes" value="true"/>
          </operator>
          <operator activated="true" class="numerical_to_polynominal" compatibility="9.6.000" expanded="true" height="82" name="Numerical to Polynominal" width="90" x="313" y="34">
            <parameter key="attribute_filter_type" value="single"/>
            <parameter key="attribute" value="sentiment"/>
            <parameter key="attributes" value=""/>
            <parameter key="use_except_expression" value="false"/>
            <parameter key="value_type" value="numeric"/>
            <parameter key="use_value_type_exception" value="false"/>
            <parameter key="except_value_type" value="real"/>
            <parameter key="block_type" value="value_series"/>
            <parameter key="use_block_type_exception" value="false"/>
            <parameter key="except_block_type" value="value_series_end"/>
            <parameter key="invert_selection" value="false"/>
            <parameter key="include_special_attributes" value="false"/>
          </operator>
          <operator activated="true" class="set_role" compatibility="9.6.000" expanded="true" height="82" name="Set Role" width="90" x="447" y="34">
            <parameter key="attribute_name" value="sentiment"/>
            <parameter key="target_role" value="label"/>
            <list key="set_additional_roles"/>
          </operator>
          <operator activated="true" class="split_data" compatibility="9.6.000" expanded="true" height="103" name="Split Data" width="90" x="849" y="34">
            <enumeration key="partitions">
              <parameter key="ratio" value="0.8"/>
              <parameter key="ratio" value="0.2"/>
            </enumeration>
            <parameter key="sampling_type" value="shuffled sampling"/>
            <parameter key="use_local_random_seed" value="false"/>
            <parameter key="local_random_seed" value="1992"/>
          </operator>
          <connect from_port="in 1" to_op="Select Attributes" to_port="example set input"/>
          <connect from_op="Select Attributes" from_port="example set output" to_op="Numerical to Polynominal" to_port="example set input"/>
          <connect from_op="Numerical to Polynominal" from_port="example set output" to_op="Set Role" to_port="example set input"/>
          <connect from_op="Set Role" from_port="example set output" to_op="Split Data" to_port="example set"/>
          <connect from_op="Split Data" from_port="partition 1" to_port="out 1"/>
          <connect from_op="Split Data" from_port="partition 2" to_port="out 2"/>
          <portSpacing port="source_in 1" spacing="0"/>
          <portSpacing port="source_in 2" spacing="0"/>
          <portSpacing port="sink_out 1" spacing="0"/>
          <portSpacing port="sink_out 2" spacing="0"/>
          <portSpacing port="sink_out 3" spacing="0"/>
        </process>
      </operator>
      <operator activated="true" class="linear_regression" compatibility="9.6.000" expanded="true" height="103" name="Linear Regression" width="90" x="313" y="85">
        <parameter key="feature_selection" value="M5 prime"/>
        <parameter key="alpha" value="0.05"/>
        <parameter key="max_iterations" value="10"/>
        <parameter key="forward_alpha" value="0.05"/>
        <parameter key="backward_alpha" value="0.05"/>
        <parameter key="eliminate_colinear_features" value="true"/>
        <parameter key="min_tolerance" value="0.05"/>
        <parameter key="use_bias" value="true"/>
        <parameter key="ridge" value="1.0E-8"/>
      </operator>
      <operator activated="true" class="apply_model" compatibility="9.6.000" expanded="true" height="82" name="Apply Model" width="90" x="581" y="238">
        <list key="application_parameters"/>
        <parameter key="create_view" value="false"/>
      </operator>
      <operator activated="true" class="performance" compatibility="9.6.000" expanded="true" height="82" name="Performance" width="90" x="782" y="136">
        <parameter key="use_example_weights" value="true"/>
      </operator>
      <connect from_op="Retrieve Tweets_sequence" from_port="output" to_op="Subprocess" to_port="in 1"/>
      <connect from_op="Subprocess" from_port="out 2" to_op="Linear Regression" to_port="training set"/>
      <connect from_op="Linear Regression" from_port="exampleSet" to_op="Apply Model" to_port="unlabelled data"/>
      <connect from_op="Apply Model" from_port="labelled data" to_op="Performance" to_port="labelled data"/>
      <connect from_op="Performance" from_port="performance" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>
@green_duck,
The working process is in the attached file.
As mentioned previously, you have a classification problem, so you need a classifier model (here I used a Naive Bayes model).
The Linear Regression operator you used is dedicated to regression tasks, which is why it raised an error in your case.
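To give a feel for what a Naive Bayes classifier does with a label such as -1/0/1, here is a small from-scratch sketch in plain Python. It is not the attached process and not RapidMiner's implementation: the function names `train_naive_bayes` and `predict`, the add-one smoothing, and the toy rows are all assumptions made up for this example.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels):
    """Count how often each nominal value appears at each position, per class."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (label, position) -> value -> count
    vocab = defaultdict(set)             # position -> distinct values seen
    for row, label in zip(rows, labels):
        for pos, value in enumerate(row):
            value_counts[(label, pos)][value] += 1
            vocab[pos].add(value)
    return class_counts, value_counts, vocab


def predict(model, row):
    """Return the class with the highest log-posterior for one row."""
    class_counts, value_counts, vocab = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)  # log prior of the class
        for pos, value in enumerate(row):
            seen = value_counts[(label, pos)][value]
            # Add-one smoothing so unseen values do not give log(0).
            score += math.log((seen + 1) / (count + len(vocab[pos]) + 1))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy data (made up): two word positions, sentiment label in {-1, 0, 1}.
rows = [("good", "movie"), ("great", "film"), ("bad", "movie"),
        ("awful", "film"), ("ok", "movie")]
labels = [1, 1, -1, -1, 0]

model = train_naive_bayes(rows, labels)
print(predict(model, ("great", "movie")))  # 1
print(predict(model, ("awful", "film")))   # -1
```

The model only ever picks one of the classes it saw during training, which is the behaviour you want for a sentiment label, whereas a regression model would output arbitrary real numbers.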
To go further and find the best model for your use case, I advise you to use the Auto-Model tool: click on Auto-Model, submit your data, choose Predict, select your label attribute (in your case "sentiment"), and then follow the instructions.
Good luck!
Hope this helps,
Regards,
Lionel
@lionelderkrikor
Thank you so much! This was very helpful. I just have one last question: are there any deep learning models (NNs) that you would suggest for this same dataset?
@green_duck,
I'm not a deep learning specialist, but as mentioned previously, you can start with the Deep Learning model offered in Auto-Model with the default parameters.
Don't forget to enable the option Turn into Classification.
That will give you a baseline performance. You can then experiment with the structure of the neural network by adding or removing hidden layers, increasing the number of epochs, changing the activation function, etc., and see whether you can improve the performance of your process.
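The trial-and-error loop described above can be sketched in plain Python as a simple grid search. Everything here is an assumption for illustration: `grid_search` is a hypothetical helper, the parameter names and values in `grid` are placeholders rather than RapidMiner settings, and `toy_score` is a stand-in that does not train any network. In practice you would replace it with something that trains the model and returns its validation performance.

```python
from itertools import product


def grid_search(evaluate, grid):
    """Try every combination in `grid`; return (best_params, best_score).
    `evaluate` takes a dict of parameters and returns a score (higher is better)."""
    names = list(grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = evaluate(params)
        if score > best_score:  # strict comparison keeps the first of any ties
            best_params, best_score = params, score
    return best_params, best_score


# Placeholder search space: layer sizes, epochs, activation names.
grid = {
    "hidden_layers": [(50,), (50, 50), (100, 50, 25)],
    "epochs": [10, 50, 100],
    "activation": ["rectifier", "tanh"],
}


def toy_score(params):
    # Stand-in scorer, NOT a real neural network: it arbitrarily peaks at
    # two hidden layers and 50 epochs so the search has something to find.
    return -abs(len(params["hidden_layers"]) - 2) - abs(params["epochs"] - 50) / 100


best_params, best_score = grid_search(toy_score, grid)
print(best_params, best_score)
```

Changing one setting at a time against the baseline, as suggested above, is the manual version of the same idea.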
Hope this helps,
Regards,
Lionel