"Is it possible to see the stopword list from the Text Processing extension?"
Prentice
I want to see the English word list used by the stopword operator, because I need to exclude one word from it: "no". In my case it is not a stopword; it carries crucial information. For example, "no fault found" would be reduced to "fault found". I've looked online for stopword lists, but they differ a lot, so I want to see what the extension actually uses.
Accepted answers
rfuentealba
Hello @Prentice,
For your case, I think it is better to take a list of words, decide which of them are stopwords for your use case, and build a custom dictionary. I haven't been able to find the extension's stopword list either, so that's what I do. I suspect that the list basically includes the common words that are not verbs, adjectives or nouns, and that its purpose is to extract context rather than to interpret meaning.
By context I mean this: the sentences "food is good" and "food is not good", once stripped of "is" and "not", are both about food quality. Meaning is what separates "food is good" from "food is bad". I spell this out because I'm not a native English speaker, and this may not be how these words are normally used.
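To illustrate the custom-dictionary idea outside the tool, here is a minimal Python sketch. The word list is a tiny made-up sample and is not the list the extension uses; the names `BASE_STOPWORDS`, `KEEP` and `filter_stopwords` are hypothetical. It only shows the principle: start from some list, then take out the words that matter for your case.

```python
import re

# Illustrative sample only. This is NOT the stopword list of the extension.
BASE_STOPWORDS = {"a", "an", "the", "is", "was", "no", "not", "of", "to", "and"}

# Words that carry meaning in this use case and must survive filtering.
KEEP = {"no", "not"}

# The custom dictionary: the base list minus the words we want to keep.
CUSTOM_STOPWORDS = BASE_STOPWORDS - KEEP


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def filter_stopwords(text, stopwords=CUSTOM_STOPWORDS):
    """Return the tokens of `text` that are not in `stopwords`."""
    return [token for token in tokenize(text) if token not in stopwords]


print(filter_stopwords("No fault found", BASE_STOPWORDS))  # ['fault', 'found']
print(filter_stopwords("No fault found"))                  # ['no', 'fault', 'found']
print(filter_stopwords("The food is not good"))            # ['food', 'not', 'good']
```

With the base list, "no fault found" loses its negation; with the custom list, both "no" and "not" are kept.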
Hope this helps,
Rodrigo.
All comments
Prentice
Thanks a lot, I will do this. I was just wondering whether that list might contain some words that I can't find elsewhere. But I guess this will have to do.
IngoRM
BTW, I personally don't really use stopword filtering any more. I use frequency-based pruning instead, which delivers similar results, and I find it easier to tune to keep things in (or out) as desired. That might be another idea to try.
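As a rough idea of what frequency-based pruning can look like, here is a small Python sketch. It is an assumption about the general technique, not the implementation of any particular operator: tokens that occur in too many or too few documents are dropped, and the hypothetical `keep` parameter protects words such as "no". The function name, thresholds and sample documents are all made up for illustration.

```python
from collections import Counter


def prune_by_document_frequency(docs, min_docs=2, max_fraction=0.8, keep=()):
    """Drop tokens that are too rare or too common across documents.

    docs: list of token lists.
    min_docs: a token must appear in at least this many documents.
    max_fraction: a token may appear in at most this share of documents.
    keep: tokens that are never pruned, whatever their frequency.
    """
    total = len(docs)
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))  # count each token once per document

    keep = set(keep)
    vocabulary = {
        token
        for token, count in doc_freq.items()
        if token in keep or (count >= min_docs and count / total <= max_fraction)
    }
    return [[token for token in tokens if token in vocabulary] for tokens in docs]


docs = [
    "the unit has no fault found".split(),
    "the fault was found in the pump".split(),
    "the pump has a leak".split(),
    "the leak was repaired".split(),
]

pruned = prune_by_document_frequency(docs, min_docs=2, max_fraction=0.75, keep={"no"})
print(pruned[0])  # ['has', 'no', 'fault', 'found']
```

In this toy example "the" is pruned because it appears in every document, while "no" survives only because it is listed in `keep`. Words like "has" and "was" remain, so the thresholds would need tuning on real data.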
Cheers,
Ingo