Strange results when repeatedly invoking the apply() method
Hi,
I am building a text classifier.
In the code below, xValidation.apply(container) is invoked three times, and it gives three completely different results. Why?
// build the text input
OperatorChain textInput = (OperatorChain) OperatorService.createOperator("TextInput");
List<String[]> para = new ArrayList<String[]>();
String[] para1 = {"graphics", "c:/data/Reuters/acq"};
String[] para2 = {"hardware", "c:/data/Reuters/corn"};
String[] para3 = {"hardware", "c:/data/Reuters/crude"};
String[] para4 = {"hardware", "c:/data/Reuters/earn"};
String[] para5 = {"hardware", "c:/data/Reuters/grain"};
para.add(para1);
para.add(para2);
para.add(para3);
para.add(para4);
para.add(para5);
textInput.setListParameter("texts", para);
textInput.setParameter("prune_below", "3");
textInput.setParameter("output_word_list", "d:/test/word.list");
Operator stringTokenizer = OperatorService.createOperator("StringTokenizer");
Operator stopWord = OperatorService.createOperator("EnglishStopwordFilter");
Operator tokenLen = OperatorService.createOperator("TokenLengthFilter");
tokenLen.setParameter("min_chars", "3");
Operator stemmer = OperatorService.createOperator("PorterStemmer");
Operator gramGenerator = OperatorService.createOperator("TermNGramGenerator");
textInput.addOperator(stringTokenizer);
textInput.addOperator(stopWord);
textInput.addOperator(tokenLen);
textInput.addOperator(stemmer);
textInput.addOperator(gramGenerator);
// build the validation
OperatorChain xValidation = (OperatorChain) OperatorService.createOperator("XValidation");
OperatorChain applierChain = (OperatorChain) OperatorService.createOperator("OperatorChain");
xValidation.setParameter("keep_example_set", "true");
Operator naiveBayes = OperatorService.createOperator("KernelNaiveBayes");
Operator modelApplier = OperatorService.createOperator("ModelApplier");
Operator performance = OperatorService.createOperator("ClassificationPerformance");
performance.setParameter("accuracy", "true");
applierChain.addOperator(modelApplier);
applierChain.addOperator(performance);
xValidation.addOperator(naiveBayes);
xValidation.addOperator(applierChain);
// start applying
IOContainer container = textInput.apply(new IOContainer());
// first run of the cross-validation
container = xValidation.apply(container);
PerformanceVector pv = container.get(PerformanceVector.class);
double accuracy = pv.getCriterion("accuracy").getAverage();
// first run: the result is 0.89
// second run, applied to the container returned by the first run
container = xValidation.apply(container);
pv = container.get(PerformanceVector.class);
accuracy = pv.getCriterion("accuracy").getAverage();
// second run: the result is 0.86
// third run, applied to the container returned by the second run
container = xValidation.apply(container);
pv = container.get(PerformanceVector.class);
accuracy = pv.getCriterion("accuracy").getAverage();
// third run: the result is 0.90
Since neither the data nor the learner changes between the calls, I would expect the results to be identical.
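One possible cause, which is only an assumption and not something I have confirmed in the RapidMiner source or documentation: if the cross-validation shuffles the examples with a random generator whose state carries over between calls, every apply() would split the data into different folds, and the accuracy would vary slightly from run to run. The sketch below is plain Java and does not use RapidMiner at all; FoldDemo and folds are hypothetical names invented for illustration. It shows that a shared generator yields different folds on each call, while a fresh generator with a fixed seed reproduces them.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical illustration only: this is NOT RapidMiner code.
class FoldDemo {

    // Shuffle the indices 0..n-1 with the given generator,
    // then deal them round-robin into k folds.
    static List<List<Integer>> folds(int n, int k, Random rng) {
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            indices.add(i);
        }
        Collections.shuffle(indices, rng);

        List<List<Integer>> result = new ArrayList<>();
        for (int f = 0; f < k; f++) {
            result.add(new ArrayList<>());
        }
        for (int i = 0; i < n; i++) {
            result.get(i % k).add(indices.get(i));
        }
        return result;
    }

    public static void main(String[] args) {
        // Shared generator: its state advances, so the two splits differ.
        Random shared = new Random();
        System.out.println(folds(10, 3, shared));
        System.out.println(folds(10, 3, shared));

        // Fresh generator with the same (arbitrary) seed: the two splits are identical.
        System.out.println(folds(10, 3, new Random(2001)));
        System.out.println(folds(10, 3, new Random(2001)));
    }
}
```

If this guess is right, fixing the seed used by the validation operator (assuming it offers such a setting) should make the three accuracies equal. If the results still differ with a fixed seed, the cause must be something else, for example the fact that each call receives the container returned by the previous call.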
Sincerely yours,
gfyang