"opinion mining/sentiment analysis-rapidminer5"
lina
New Altair Community Member
Hi!
I would appreciate any information you can give me; this is really important to me.
I have created an Excel file filled with comments about a specific topic.
Now I am trying to classify these comments (short sentences collected from various sources on the web)
into positive, neutral, and negative.
How can I proceed?
Please note that all comments are written in Greek. I hope that is not a problem.
Since I am new to this topic, I would be really grateful for any help.
Thanks in advance; I am looking forward to your reply!
Answers
See my blog: vancouverdata.blogspot.com
I have a five-part video series on text mining; the fifth part shows how to do classification (sentiment analysis).
Good luck,
Neil
Thank you very much, Neil! I'm going to visit your blog and watch the videos!
Hi!
I'm still working on opinion mining, but I have a few problems.
I have watched the videos from vancouverdata. By the way, I found them really helpful. Thanks, Neil!
First of all, my comments are in Greek, so I want to create a stopword file for the Filter Stopwords operator. Does anyone know what format the file should have? I created a file like this: "word1|word2....", but unfortunately it is not recognized. Any ideas, please?
Also, there is no stemming operator for my language. How can I create one, as it seems to be very important?
Apart from these problems, I followed the method shown in the fifth part of the video series, but I ran into another problem. The Naive Bayes operator reports: "cannot check whether input example set has special attribute "label""
What does this mean? Should I specify a label or an attribute in the file I use? Specifically, I use an Excel file instead of the database used in the video.
Sorry for the long post.
I'm looking forward to your answers and your help!
Lina
Filter Stopwords by Dictionary allows you to create your own stoplist - it reads from a file that you create.
You can try using regular expressions to create a basic stemmer if the endings of Greek words are consistent for cases and gender.
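To make the regular-expression idea concrete, here is a minimal Python sketch of a suffix-stripping stemmer. It is only an illustration of the technique, not a RapidMiner operator and not a real Greek stemmer: the suffix list, the `crude_stem` name, and the `min_stem` threshold are all assumptions made up for this example, and a usable list would need proper knowledge of Greek morphology.

```python
import re

# Hypothetical, deliberately tiny list of noun endings (illustration only).
SUFFIXES = ["ους", "ων", "ος", "οι", "ες", "ης", "ας", "α", "η", "ο", "ι"]

# One alternation anchored at the end of the word. Because the match must end
# at the end of the string, the leftmost match is the longest listed suffix.
SUFFIX_RE = re.compile("(?:" + "|".join(SUFFIXES) + ")$")


def crude_stem(word, min_stem=3):
    """Strip one listed ending, unless the remaining stem would be too short."""
    word = word.lower()
    stem = SUFFIX_RE.sub("", word)
    # Keep short words (often function words) unchanged.
    return stem if len(stem) >= min_stem else word


if __name__ == "__main__":
    for w in ["δρόμος", "δρόμους", "δρόμοι", "και"]:
        print(w, "->", crude_stem(w))
```

The same pattern could in principle be applied inside the text-processing step with whatever regex-replacement operator your RapidMiner version provides; which operator that is depends on the installed Text Processing extension.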
Here is a simple classifier you can adapt:
http://rapid-i.com/rapidforum/index.php/topic,2993.0.html
Remove the N-Gram operator and change input to Excel. The column that contains the opinion should be set as Label in the Set Role operator.
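To see why the learner insists on a label, the following is a small Python sketch of a multinomial Naive Bayes classifier with add-one smoothing. It is not what RapidMiner runs internally; the function names and the toy English training rows are invented for this example. The point is that training needs every row to carry a known class (the "label" column), which is the role Set Role assigns.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy data: (comment text, label). In the Excel file, the second
# element corresponds to the column that gets the "label" role.
TRAIN = [
    ("great product love it", "positive"),
    ("love the great service", "positive"),
    ("terrible service hate it", "negative"),
    ("hate this terrible product", "negative"),
]


def train_naive_bayes(rows):
    """Count documents per label and word occurrences per label."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in rows:
        label_counts[label] += 1
        for token in text.lower().split():
            word_counts[label][token] += 1
            vocab.add(token)
    return label_counts, word_counts, vocab


def predict(model, text):
    """Return the label with the highest log-probability for the text."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in label_counts.items():
        total_words = sum(word_counts[label].values())
        score = math.log(n_docs / total_docs)  # log prior
        for token in text.lower().split():
            if token not in vocab:
                continue  # ignore words never seen in training
            # Add-one (Laplace) smoothing avoids log(0) for unseen pairs.
            score += math.log(
                (word_counts[label][token] + 1) / (total_words + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


if __name__ == "__main__":
    model = train_naive_bayes(TRAIN)
    print(predict(model, "love great"))
    print(predict(model, "terrible hate"))
```

A third class such as "neutral" works the same way: it only needs labeled rows of its own in the training data.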
B.
Thank you so much, B. I do appreciate your help!
"Filter Stopwords by Dictionary allows you to create your own stoplist - it reads from a file that you create."
I'm trying to create this file, but it is not recognized by RapidMiner. What format should the file have?
I've tried something like "word1|word2...", but it doesn't work. Any ideas?
Regarding the classifier and the example given, I'm going to check it out, and I hope I manage to classify my own documents!
On Windows, it's a plain .txt file with one stopword per line.
For example, rmstop.txt contains:
one
two
three
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.1.003">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="5.1.003" expanded="true" name="Process">
<process expanded="true" height="386" width="413">
<operator activated="true" class="text:create_document" compatibility="5.1.001" expanded="true" height="60" name="Create Document" width="90" x="98" y="61">
<parameter key="text" value="Apples are green and red. Lemons are yellow. One lemon and two oranges. Three apples."/>
<parameter key="add label" value="true"/>
<parameter key="label_type" value="text"/>
<parameter key="label_value" value="textlabel"/>
</operator>
<operator activated="true" class="text:documents_to_data" compatibility="5.1.001" expanded="true" height="76" name="Documents to Data" width="90" x="112" y="210">
<parameter key="text_attribute" value="textlabel"/>
<parameter key="add_meta_information" value="false"/>
</operator>
<operator activated="true" class="text:process_document_from_data" compatibility="5.1.001" expanded="true" height="76" name="Process Documents from Data" width="90" x="313" y="165">
<parameter key="vector_creation" value="Term Frequency"/>
<list key="specify_weights"/>
<process expanded="true" height="505" width="774">
<operator activated="true" class="text:tokenize" compatibility="5.1.001" expanded="true" height="60" name="Tokenize" width="90" x="112" y="75"/>
<operator activated="true" class="text:filter_stopwords_dictionary" compatibility="5.1.001" expanded="true" height="60" name="Filter Stopwords (Dictionary)" width="90" x="514" y="75">
<parameter key="file" value="M:\Data\rmstop.txt"/>
</operator>
<connect from_port="document" to_op="Tokenize" to_port="document"/>
<connect from_op="Tokenize" from_port="document" to_op="Filter Stopwords (Dictionary)" to_port="document"/>
<connect from_op="Filter Stopwords (Dictionary)" from_port="document" to_port="document 1"/>
<portSpacing port="source_document" spacing="0"/>
<portSpacing port="sink_document 1" spacing="0"/>
<portSpacing port="sink_document 2" spacing="0"/>
</process>
</operator>
<connect from_op="Create Document" from_port="output" to_op="Documents to Data" to_port="documents 1"/>
<connect from_op="Documents to Data" from_port="example set" to_op="Process Documents from Data" to_port="example set"/>
<connect from_op="Process Documents from Data" from_port="example set" to_port="result 1"/>
<connect from_op="Process Documents from Data" from_port="word list" to_port="result 2"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
<portSpacing port="sink_result 3" spacing="0"/>
</process>
</operator>
</process>
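For anyone who wants to generate such a stoplist programmatically, here is a minimal Python sketch that writes one word per line and then filters tokens against it. The function names are made up for this example. Writing the file as UTF-8 is an assumption that suits Greek text; check which encoding the operator in your RapidMiner version expects. The case-insensitive comparison is also a choice of this sketch, not a statement about how the operator behaves.

```python
from pathlib import Path


def write_stopword_file(words, path):
    """Write one stopword per line, with no "|" or "," separators."""
    Path(path).write_text("\n".join(words) + "\n", encoding="utf-8")


def read_stopword_file(path):
    """Read the stoplist back as a lowercase set, skipping blank lines."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    return {line.strip().lower() for line in lines if line.strip()}


def filter_tokens(tokens, stopwords):
    """Drop every token whose lowercase form is in the stoplist."""
    return [t for t in tokens if t.lower() not in stopwords]


if __name__ == "__main__":
    write_stopword_file(["one", "two", "three"], "rmstop.txt")
    stop = read_stopword_file("rmstop.txt")
    print(filter_tokens("One lemon and two oranges".split(), stop))
```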
Create a plain-text (ASCII) file with a .txt or .csv extension.
Sample of the file's data structure:
attrib1,attrib2,attrib3
apple,monkey,brick
orange,monkey,stick