"opinion mining/sentiment analysis-rapidminer5"

Hi!
I would appreciate any information you can give me; this is really important to me.
I have created an Excel file filled with comments about a specific topic.
Now I am trying to classify these comments (they are short sentences collected from various sources on the web) into positive, neutral, and negative.
How should I proceed?
Please note that all the comments are written in Greek. I hope that is not a problem.
Since I am new to this topic, I would be really grateful for any help.
Thanks in advance. I am looking forward to your reply!

el_chief

See my blog: vancouverdata.blogspot.com

I have a five-part video series on text mining, including how to do classification (sentiment analysis) in the fifth part.

Good luck,

neil

lina

Neil McGuigan wrote:

See my blog: vancouverdata.blogspot.com

I have a five-part video series on text mining, including how to do classification (sentiment analysis) in the fifth part.

Good luck,

neil

Thank you very much, Neil! I'm going to visit your blog and watch the videos!

lina

Hi!
I'm still working on opinion mining, but I have a few problems.
I have watched the videos from vancouverdata. By the way, I found them really helpful. Thanks, Neil!
First of all, the language I use is Greek, so I want to create a text file for the Filter Stopwords operator. Does anyone know what the file should look like? I've created one like this: "word1|word2....", but unfortunately it is not recognized. Any ideas, please?
Also, there is no stemming operator for my language. How can I create one, since it seems to be very important?
Apart from these problems, I have followed the method shown in the fifth part of the video series, but I get an error there too. The Naive Bayes operator reports: "cannot check whether input example set has special attribute "label"".
What does this mean? Should I specify a label or an attribute in the file I use? Specifically, I use an Excel file instead of the database used in the video.
Sorry for the long post.
I'm looking forward to your answers and your help!

Lina

Filter Stopwords by Dictionary allows you to create your own stoplist - it reads from a file that you create.

You can try using regular expressions to create a basic stemmer if the endings of Greek words are consistent for cases and gender.
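As a rough illustration of the regular-expression idea, here is a minimal Python sketch written outside RapidMiner. The suffix list, the `min_stem` threshold, and the function names are assumptions invented for this example; it is not a vetted Greek stemmer, and the pattern would still have to be carried over to whichever regex-based step you use in your process.

```python
import re
import unicodedata

# Hypothetical, deliberately tiny list of Greek endings (illustration only).
SUFFIXES = ["ους", "ων", "ες", "ος", "ης", "ας", "α", "η", "ο", "ι"]

# Put longer endings first in the alternation so they are tried before shorter ones.
_SUFFIX_RE = re.compile(
    "(?:" + "|".join(sorted(SUFFIXES, key=len, reverse=True)) + ")$"
)


def strip_accents(text):
    """Remove accent marks so that 'ός' and 'ος' are treated alike."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def crude_stem(token, min_stem=3):
    """Strip one known ending; keep the word whole if the stem gets too short."""
    plain = strip_accents(token.lower())
    stem = _SUFFIX_RE.sub("", plain, count=1)
    return stem if len(stem) >= min_stem else plain
```

With this sketch, different inflections of the same adjective collapse to one stem, while very short words such as articles are left alone. A realistic suffix list would be much longer and would need checking against real Greek text.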

Here is a simple classifier you can adapt:
http://rapid-i.com/rapidforum/index.php/topic,2993.0.html

Remove the N-Gram operator and change input to Excel. The column that contains the opinion should be set as Label in the Set Role operator.
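To show why the learner needs a label column at all, here is a minimal standard-library Python sketch of a Naive Bayes text classifier. It is not RapidMiner's implementation; the function names, the whitespace tokenization, and the add-one smoothing are assumptions chosen for brevity. The second element of each training pair plays the part of the column marked as Label.

```python
import math
from collections import Counter, defaultdict


def train(rows):
    """rows: list of (text, label) pairs; the label is the class to learn."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in rows:
        class_counts[label] += 1
        for tok in text.lower().split():
            word_counts[label][tok] += 1
            vocab.add(tok)
    return class_counts, word_counts, vocab


def predict(model, text):
    """Return the label with the highest log-probability for the text."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, float("-inf")
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)  # class prior
        total_words = sum(word_counts[label].values())
        for tok in text.lower().split():
            if tok not in vocab:
                continue  # ignore words never seen in training
            # Add-one smoothing so unseen (word, class) pairs are not zero.
            score += math.log(
                (word_counts[label][tok] + 1) / (total_words + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label
```

Without the labels in the training pairs there is nothing to count per class, which is roughly what the "special attribute label" error is complaining about.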

B.

lina

Thank you so much, B. I do appreciate your help!

Filter Stopwords by Dictionary allows you to create your own stoplist - it reads from a file that you create.

I'm trying to create this file, but it is not recognized by RapidMiner. What format should the file have?
I've tried something like "word1|word2...", but it doesn't work. Any ideas?
Regarding the classifier and the example given, I'm going to check it out, and I hope I manage to classify my own documents!

On Windows it's a plain .txt file with one word per line.

In rmstop.txt:
one
two
three
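If the stopwords already exist as a single "word1|word2..." string, a short script can turn them into the one-word-per-line layout shown above. This is a minimal Python sketch; the function name `write_stoplist` and the choice of UTF-8 are assumptions, and the encoding has to match whatever RapidMiner is set to read, which matters for Greek text.

```python
from pathlib import Path


def write_stoplist(pipe_separated, path, encoding="utf-8"):
    """Split "word1|word2|..." and write one word per line to `path`."""
    words = [w.strip() for w in pipe_separated.split("|") if w.strip()]
    Path(path).write_text("\n".join(words) + "\n", encoding=encoding)
    return words


# Example call (hypothetical word list and file name):
# write_stoplist("και|το|να", "rmstop.txt")
```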



<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.1.003">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="5.1.003" expanded="true" name="Process">
    <process expanded="true" height="386" width="413">
      <operator activated="true" class="text:create_document" compatibility="5.1.001" expanded="true" height="60" name="Create Document" width="90" x="98" y="61">
        <parameter key="text" value="Apples are green and red.&#10;Lemons are yellow.&#10;One lemon and two oranges.&#10;Three apples."/>
        <parameter key="add label" value="true"/>
        <parameter key="label_type" value="text"/>
        <parameter key="label_value" value="textlabel"/>
      </operator>
      <operator activated="true" class="text:documents_to_data" compatibility="5.1.001" expanded="true" height="76" name="Documents to Data" width="90" x="112" y="210">
        <parameter key="text_attribute" value="textlabel"/>
        <parameter key="add_meta_information" value="false"/>
      </operator>
      <operator activated="true" class="text:process_document_from_data" compatibility="5.1.001" expanded="true" height="76" name="Process Documents from Data" width="90" x="313" y="165">
        <parameter key="vector_creation" value="Term Frequency"/>
        <list key="specify_weights"/>
        <process expanded="true" height="505" width="774">
          <operator activated="true" class="text:tokenize" compatibility="5.1.001" expanded="true" height="60" name="Tokenize" width="90" x="112" y="75"/>
          <operator activated="true" class="text:filter_stopwords_dictionary" compatibility="5.1.001" expanded="true" height="60" name="Filter Stopwords (Dictionary)" width="90" x="514" y="75">
            <parameter key="file" value="M:\Data\rmstop.txt"/>
          </operator>
          <connect from_port="document" to_op="Tokenize" to_port="document"/>
          <connect from_op="Tokenize" from_port="document" to_op="Filter Stopwords (Dictionary)" to_port="document"/>
          <connect from_op="Filter Stopwords (Dictionary)" from_port="document" to_port="document 1"/>
          <portSpacing port="source_document" spacing="0"/>
          <portSpacing port="sink_document 1" spacing="0"/>
          <portSpacing port="sink_document 2" spacing="0"/>
        </process>
      </operator>
      <connect from_op="Create Document" from_port="output" to_op="Documents to Data" to_port="documents 1"/>
      <connect from_op="Documents to Data" from_port="example set" to_op="Process Documents from Data" to_port="example set"/>
      <connect from_op="Process Documents from Data" from_port="example set" to_port="result 1"/>
      <connect from_op="Process Documents from Data" from_port="word list" to_port="result 2"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
      <portSpacing port="sink_result 3" spacing="0"/>
    </process>
  </operator>
</process>

frito

lina wrote:

Thank you so much, B. I do appreciate your help!

Filter Stopwords by Dictionary allows you to create your own stoplist - it reads from a file that you create.

I'm trying to create this file, but it is not recognized by RapidMiner. What format should the file have?
I've tried something like "word1|word2...", but it doesn't work. Any ideas?
Regarding the classifier and the example given, I'm going to check it out, and I hope I manage to classify my own documents!

Create an ASCII file with a .txt or .csv extension.
Sample of the file's data structure:

attrib1,attrib2,attrib3
apple,monkey,brick
orange,monkey,stick