1) My text is filtered against a list of English stop words, and some words (e.g. and, or) are removed. - I work with biology texts, so I'm wondering what happens to unusual tokens such as IL-6. Are they filtered out or kept?
2) The stemmer keeps only the "basic chunks" (stems) of my words. I think this is based on a dictionary. - Could you tell me which dictionary it is? I need to know precisely, in order to answer the question "does it contain medical terms such as glicolase?", which is crucial for me right now. - What happens to my unusual words (e.g. IL-6)? Are they pruned, chunked in some way, or kept as they are?
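
To make both questions concrete, here is a minimal Python sketch of the two steps. Everything in it is an assumption for illustration: the stop word list, the suffix rules, and the function names are hypothetical and are not taken from the tool I am using. It shows that the fate of a token like IL-6 may depend on how the text is tokenized before filtering, and that a rule-based stemmer (many common stemmers, such as Porter's, strip suffixes by rule rather than by dictionary lookup) can process a word without knowing it.

```python
import re

# Hypothetical, deliberately tiny stop word list; real tools ship longer ones.
STOP_WORDS = {"and", "or", "the", "of", "is", "in", "a"}

# Hypothetical suffix rules, checked in order; this is NOT Porter's algorithm.
SUFFIXES = ("ations", "ation", "ing", "ases", "ase", "es", "s")


def whitespace_tokens(text):
    """Split on whitespace only, so hyphenated terms stay whole."""
    return text.split()


def letters_only_tokens(text):
    """Keep runs of letters only, so digits and hyphens are discarded."""
    return re.findall(r"[A-Za-z]+", text)


def remove_stop_words(tokens):
    """Drop tokens whose lowercase form is in the stop word list."""
    return [t for t in tokens if t.lower() not in STOP_WORDS]


def toy_stem(word, min_stem=3):
    """Lowercase the word and strip the first matching suffix,
    provided at least min_stem characters remain."""
    lower = word.lower()
    for suffix in SUFFIXES:
        if lower.endswith(suffix) and len(lower) - len(suffix) >= min_stem:
            return lower[: -len(suffix)]
    return lower


text = "IL-6 and kinases regulate the binding"
kept_whole = remove_stop_words(whitespace_tokens(text))
kept_split = remove_stop_words(letters_only_tokens(text))
stems = [toy_stem(t) for t in kept_whole]

print(kept_whole)  # IL-6 survives as one token
print(kept_split)  # IL-6 has been reduced to "IL"
print(stems)       # suffixes stripped by rule, no dictionary involved
```

In this sketch a stop word list alone never removes IL-6, because the token is not in the list; it is the tokenizer that decides whether it stays whole. Whether the real tool behaves the same way is exactly what I would like to confirm.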