Sentiment Analysis
crimson_crow
New Altair Community Member
Hello! I'm new to RapidMiner and I want to learn sentiment analysis for my coursework. The goal is to build a model that classifies reviews as positive or negative. The program includes an example process, but I want to change a couple of things:
1. Replace the example set with my own, which has more data.
2. Instead of a document containing a single review for the model to score, I want to use an .xlsx file of reviews that I parsed from the IMDb site.
The problems are in the "Cross Validation" operator (screenshot "First Problem") and in the "Read Document" operator (screenshot "Second Problem").
I can't understand why the "Cross Validation" operator reports a type problem, because my data has the same structure as the example. Also, which operator should I use to read the parsed data from the .xlsx file correctly?
Best Answer
Hi @crimson_crow,
Thanks for sharing your process and data.
You have to:
- Apply the same pre-processing step(s) in your training branch and in your scoring branch. That means putting a Nominal to Text operator in your scoring branch (you don't need a Read Document operator).
- Add a Process Documents from Data operator to your scoring branch, as in your training branch.
- Simplify your Cross Validation operator: I just use an SVM model in the training part, and an Apply Model and a Performance (Binominal Classification) operator in the test part.
The working process is in the attached file.
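To show why both branches need the same pre-processing, here is a rough sketch of the idea in plain Python rather than RapidMiner. It is not what the operators do internally; the function names (`tokenize`, `fit_vocabulary`, `vectorize`) and the toy reviews are made up for illustration. The word list is built from the training reviews only and then reused for the reviews to be scored, so both sets end up with identical columns.

```python
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def fit_vocabulary(texts):
    """Build a sorted word list from the training texts only."""
    return sorted({token for text in texts for token in tokenize(text)})


def vectorize(texts, vocabulary):
    """Turn each text into a term-count vector over a fixed vocabulary.

    Words that are not in the vocabulary are dropped, so training and
    scoring data always have exactly the same columns.
    """
    vectors = []
    for text in texts:
        counts = Counter(tokenize(text))  # missing words count as 0
        vectors.append([counts[word] for word in vocabulary])
    return vectors


# Hypothetical toy data: labelled training reviews and the unlabelled
# reviews that would come from the .xlsx file.
train_reviews = ["A great and moving film", "Boring plot and bad acting"]
score_reviews = ["Great acting", "Bad film with a dull script"]

vocabulary = fit_vocabulary(train_reviews)               # training branch
train_vectors = vectorize(train_reviews, vocabulary)
score_vectors = vectorize(score_reviews, vocabulary)     # scoring branch

print(vocabulary)
print(score_vectors)
```

If the scoring reviews were vectorized with their own word list instead, the columns would no longer line up with what the model was trained on, which is the kind of mismatch that shows up as a type or attribute problem.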
Regards,
Lionel
Answers
Thanks a lot, @lionelderkrikor! That is exactly what I've been looking for! The solution turned out to be easier than I thought :)