"How to improve Classification in Text Mining"

User: "mdc"
New Altair Community Member
Updated by Jocelyn
I'm classifying technical papers into 15 classes using their abstracts.

My processes are simple.

Learning:
+ TextInput
  + String Tokenizer
  + English StopwordFilter
  + TokenLengthFilter
+ Binary2MultiClassLearner
  + LibSVMLearner
+ ModelWriter

Applying:
+ TextInput
  + String Tokenizer
  + English StopwordFilter
  + TokenLengthFilter
+ ModelLoader
+ ModelApplier
+ ExcelExampleSetWriter
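
To make the shared preprocessing chain in the two processes above concrete, here is a rough Python sketch of what the tokenizer, stopword filter and token length filter do to one abstract. It is not the RapidMiner implementation: the stopword list, the length bounds (`min_len`, `max_len`) and the function name `preprocess` are all assumptions made up for illustration.

```python
import re

# Assumption: a tiny illustrative stopword list, not the one the
# English StopwordFilter operator actually uses.
STOPWORDS = {"a", "an", "the", "of", "and", "in", "to", "is", "for", "on", "with"}


def preprocess(text, min_len=3, max_len=25):
    """Tokenize, drop stopwords, then drop tokens outside [min_len, max_len]."""
    # Lowercase and split on anything that is not a letter.
    tokens = re.findall(r"[a-z]+", text.lower())
    # Stopword filter.
    tokens = [t for t in tokens if t not in STOPWORDS]
    # Token length filter (the bounds are hypothetical defaults).
    return [t for t in tokens if min_len <= len(t) <= max_len]


abstract = "A study of the support vector machine for text classification"
print(preprocess(abstract))
```

The point of the sketch is that the learning and the applying process have to run exactly the same chain with the same settings, otherwise the attributes the stored model expects will not match the ones produced at apply time.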

I get results, but I'm not satisfied with them. How can I improve them?

I've been searching the forum and have seen that feature selection is one way. There are lots of examples that use the FeatureSelection operator, but I couldn't find one that writes to a model file. One example from the installer is shown below, but I couldn't figure out where to add the ModelWriter. Or am I thinking about this the wrong way?
....
+ FeatureSelection
  + XValidation
    + NearestNeighbors
    + OperatorChain
      + ModelApplier
      + Performance
  + ProcessLog
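
As I understand the example above, the feature selection wraps a cross-validated nearest-neighbour learner, and its result is a set of selected attributes rather than a model. The Python sketch below illustrates that idea outside RapidMiner: a greedy forward selection scored by leave-one-out 1-nearest-neighbour accuracy, with the selected attribute names saved so an applying step can reuse them. The data, the attribute names and the helper names (`loo_accuracy`, `forward_selection`) are made up for illustration, and this is a simplification of whatever the FeatureSelection operator does internally.

```python
import json


def loo_accuracy(rows, labels, features):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier
    that only looks at the given feature indices."""
    correct = 0
    for i, row in enumerate(rows):
        best_dist, best_label = None, None
        for j, other in enumerate(rows):
            if i == j:
                continue  # hold out the example being classified
            dist = sum((row[f] - other[f]) ** 2 for f in features)
            if best_dist is None or dist < best_dist:
                best_dist, best_label = dist, labels[j]
        correct += best_label == labels[i]
    return correct / len(rows)


def forward_selection(rows, labels, n_features):
    """Greedily add the feature that improves accuracy most;
    stop as soon as no remaining feature improves it."""
    selected, best_score = [], 0.0
    while True:
        candidates = [f for f in range(n_features) if f not in selected]
        scored = [(loo_accuracy(rows, labels, selected + [f]), f) for f in candidates]
        if not scored:
            break
        # Highest score wins; ties go to the lowest feature index.
        score, feature = max(scored, key=lambda s: (s[0], -s[1]))
        if score <= best_score:
            break
        selected.append(feature)
        best_score = score
    return selected, best_score


terms = ["kernel", "noise", "protein"]  # made-up attribute names
rows = [[3, 1, 0], [2, 5, 0], [0, 1, 4], [0, 5, 3]]  # made-up term counts
labels = ["ml", "ml", "bio", "bio"]

selected, score = forward_selection(rows, labels, len(terms))

# Learning side: store the selection next to the model.
saved = json.dumps({"selected_terms": [terms[f] for f in selected]})

# Applying side: load the selection and keep only those attributes.
keep = json.loads(saved)["selected_terms"]
new_example = {"kernel": 1, "noise": 7, "protein": 0}
reduced = {t: new_example[t] for t in keep}
print(keep, score, reduced)
```

In this sketch the selection is a separate artifact from the model: the final learner would be trained on the reduced attributes only, and the same reduction has to be applied to new abstracts before the stored model sees them.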

I'm also thinking of forcing bigger weights onto some attributes. Is this a good idea, and if so, how would I do it?
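
To show what I mean by forcing weights, here is a minimal Python sketch, assuming each abstract has already been turned into a term-to-value vector. The term names, the values and the boost factor are hypothetical, and `apply_weights` is just a name I made up.

```python
def apply_weights(vector, weights, default=1.0):
    """Scale each term value by its weight; terms without an
    explicit weight keep the default weight."""
    return {term: value * weights.get(term, default) for term, value in vector.items()}


boost = {"kernel": 2.0}  # hypothetical hand-picked weight
doc = {"kernel": 0.5, "method": 0.25}  # hypothetical term values
weighted = apply_weights(doc, boost)
print(weighted)
```

Whatever weights are used, they would have to be applied identically in the learning and the applying process, since the SVM is sensitive to how attributes are scaled.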

Thanks,
Matthew
