"context/feature based opinion mining/sentiment analysis"
alexjohnp
Hello everybody,
I'm pretty new to RapidMiner, and I'm stuck on the following problem.
I managed to build a simple sentiment classifier following Pang's theory and the examples on the Internet (especially those on vancouverdata). Now I'd like to extend the concept by extracting specific features (n-grams) and showing the sentiment score of each.
For example, take the following phrase: "the camera has a pretty good focus, but its flash lacks of speed". It has two features: focus (positive) and flash (negative).
Could you help me get through the pain?
Thank you in advance,
JEdward
I'm sure there are many ways to look at this.
If you are mining English examples whose clauses are separated by commas, then it's straightforward: you just split on the comma.
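A minimal sketch of that comma split, written in Python rather than as a RapidMiner process. The function name split_clauses is made up for illustration, and the input is the phrase from the original question.

```python
import re

def split_clauses(text):
    """Split a sentence into clauses on commas, dropping empty pieces."""
    parts = re.split(r"\s*,\s*", text.strip())
    return [p for p in parts if p]

# Example phrase from the original question.
clauses = split_clauses(
    "the camera has a pretty good focus, but its flash lacks of speed"
)
# clauses[0] is the clause about focus, clauses[1] the clause about flash.
```

Each clause can then be scored on its own, which only works when every clause really covers a single feature.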
Let's assume that you don't have that luxury. I am, however, going to assume that all your posts are on the one subject.
So for example:
"the camera has a pretty good focus but its flash lacks of speed"
"The Canon Sureshot has a pretty good focus and flash, but tastes awful without ketchup."
"I've always liked the focus on my Canon, but really think the lightmeter is poor."
I'd suggest the following approach (others may disagree):
First I'd add an ID so you can split up the documents in many ways, but still combine them again later.
1: build a list of N-Grams (4-5 terms long at most seems about right)
2.1: build a list of features of the subject (flash, focus, shutter, lens, etc.).
2.2: build a list of positive & negative terms for labelling, e.g. positive: good, pretty, etc.
3.1: eliminate any N-Grams that contain more than one feature.
(This is where I think my approach may be wrong: do you remove "pretty good focus and flash" and just keep "pretty good focus"?)
3.2: eliminate any N-Grams that contain conflicting sentiment (e.g. keep "good focus but bad flash", do not keep "good focus but bad")
4: build a sentiment mining model from the N-Grams
5: have a look at the most positive / least positive words in the N-Grams (that aren't features) and see if they should be added to the labelling in step 2.2
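Steps 1 to 3 above can be sketched roughly as follows. This is plain Python, not a RapidMiner process. The feature list and the positive terms come from the post, while the negative terms, the tokenizer and all function names are assumptions made for illustration. Step 3.1 is read strictly here: any N-Gram with more than one feature is dropped.

```python
import re

# Step 2.1: feature list taken from the post.
FEATURES = {"flash", "focus", "shutter", "lens"}
# Step 2.2: "good" and "pretty" come from the post; the negative terms are assumed.
POSITIVE = {"good", "pretty"}
NEGATIVE = {"bad", "poor", "awful"}

def tokenize(text):
    """Lowercase the text and keep runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())

def ngrams(tokens, max_len=5):
    """Step 1: every contiguous n-gram of 1 to max_len tokens."""
    return [tuple(tokens[i:i + n])
            for n in range(1, max_len + 1)
            for i in range(len(tokens) - n + 1)]

def keep(gram):
    """Step 3.1: at most one feature. Step 3.2: no mixed sentiment."""
    words = set(gram)
    if len(words & FEATURES) > 1:
        return False
    return not (words & POSITIVE and words & NEGATIVE)

tokens = tokenize("The Canon Sureshot has a pretty good focus and flash")
kept = [" ".join(g) for g in ngrams(tokens) if keep(g)]
```

Under this strict reading, "pretty good focus and flash" is dropped and "pretty good focus" survives, which is one possible answer to the question raised in step 3.1. Whether losing the positive mention of flash is acceptable is a separate judgment call.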
After repeating this process a few times on the sample data, it should be possible to join your N-Grams up with your list of features to show what the overall sentiment balance is for the individual features, e.g. focus 30 / 45 / 25 (positive, negative, neutral).
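A rough sketch of that final join, assuming each kept N-Gram has already been reduced to a (feature, label) pair and that the three numbers are percentages. The function name and the sample labels are made up for illustration; they are not real results.

```python
from collections import Counter, defaultdict

def sentiment_balance(labelled):
    """Return {feature: (positive %, negative %, neutral %)}."""
    counts = defaultdict(Counter)
    for feature, label in labelled:
        counts[feature][label] += 1
    balance = {}
    for feature, c in counts.items():
        total = sum(c.values())
        balance[feature] = tuple(
            round(100 * c[k] / total)
            for k in ("positive", "negative", "neutral")
        )
    return balance

# Made-up labels, only to show the shape of the output.
sample = [
    ("focus", "positive"), ("focus", "positive"),
    ("focus", "negative"), ("focus", "neutral"),
    ("flash", "negative"), ("flash", "negative"),
]
balance = sentiment_balance(sample)
```

Keeping the document ID alongside each pair would also allow the same balance to be computed per post instead of over the whole collection.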
I won't put together a sample process though as I think there are probably better ideas than mine on here.
puteri_prameswa
Dear Alexjohnp,
I am using RapidMiner for my final thesis on feature-based sentiment analysis, and I face the same problem as you. I would like to know whether you have already found a way to solve it.
Could you explain it to me?
Also thanks JEdward for sharing.
Thank you so much.