Sentiment Analysis

crimson_crow
crimson_crow New Altair Community Member
edited November 5 in Community Q&A
Hello! I'm new to RapidMiner and I want to learn Sentiment Analysis for my coursework. The goal is to build a model that can classify reviews as positive or negative. The program includes an example process, but I want to change a couple of things:
1. Replace the example set with my own, which has more data.
2. Instead of having the model score a document containing only one review, I want to use an .xlsx file of reviews that I parsed from the IMDb site.
The problems are in the "Cross Validation" operator (screenshot "First Problem") and in the "Read Document" operator (screenshot "Second Problem").
I can't understand why the "Cross Validation" operator reports a type problem, because my data has the same structure as the example. Also, which operator should I use to read the parsed data from the .xlsx file correctly?

Best Answer

  • lionelderkrikor
    lionelderkrikor New Altair Community Member
    Answer ✓
    Hi @crimson_crow,

    Thanks for sharing your process and data.
    You have to:
     - Apply the same pre-processing step(s) in your training branch and in your scoring branch: put a Nominal to Text operator in your scoring branch (you don't need a Read Document operator).
     - Add a Process Documents from Data operator to your scoring branch (as in your training branch).
     - Simplify your Cross Validation operator: I just use an SVM model in the training part, and an Apply Model and a Performance (Binominal Classification) in the test part.
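
    For readers who think more easily in code, here is a minimal sketch in plain Python (not RapidMiner) of the idea behind the first two points: the scoring data has to pass through the same text pre-processing, with the same word list, as the training data. Everything in it is an assumption made for illustration: the function names are hypothetical, the four training reviews are made up, and a tiny perceptron stands in for the SVM. It also leaves out cross-validation.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def fit_vocabulary(texts):
    """Build the word list from the training texts only."""
    return sorted({token for text in texts for token in tokenize(text)})


def vectorize(text, vocabulary):
    """Count terms over a fixed vocabulary; words not in it are dropped."""
    counts = Counter(tokenize(text))
    return [counts[word] for word in vocabulary]


def train_perceptron(rows, labels, epochs=20):
    """Small linear classifier (stand-in for the SVM); labels are +1 or -1."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        for row, label in zip(rows, labels):
            score = bias + sum(w * x for w, x in zip(weights, row))
            if label * score <= 0:  # misclassified: move towards the label
                weights = [w + label * x for w, x in zip(weights, row)]
                bias += label
    return weights, bias


def predict(row, model):
    """Return 'positive' or 'negative' for one vectorized review."""
    weights, bias = model
    score = bias + sum(w * x for w, x in zip(weights, row))
    return "positive" if score > 0 else "negative"


# Training branch: made-up labelled reviews (+1 positive, -1 negative).
train_texts = [
    "a wonderful and moving film",
    "great acting and a great story",
    "boring plot and terrible acting",
    "a dull and terrible waste of time",
]
train_labels = [1, 1, -1, -1]

vocabulary = fit_vocabulary(train_texts)
train_rows = [vectorize(text, vocabulary) for text in train_texts]
model = train_perceptron(train_rows, train_labels)

# Scoring branch: same tokenizer and same vocabulary as the training branch.
new_reviews = ["a wonderful story", "terrible and boring"]
predictions = [predict(vectorize(text, vocabulary), model) for text in new_reviews]
print(predictions)
```

    The point to notice is that `vectorize` is called with the training `vocabulary` in both branches, so the model always sees the same columns in the same order.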

    The working process is in the attached file.

    Regards,

    Lionel 

Answers

  • crimson_crow
    crimson_crow New Altair Community Member
    Thanks a lot, @lionelderkrikor! That is exactly what I've been looking for! The solution turned out to be easier than I thought :)