Hi all,
In Java code I would like to create an ExampleSet with the text plugin. I tried WVToolRapidMinerExample.java with my input and it works fine. I copied the exact code of this example into my own method and I get an "access denied" error. Debugging showed that WVToolRapidMinerExample.java treats my input as a directory containing training documents, as it should, but when I use my method the directories are treated as files, which of course results in an exception.
Here is my code. RapidMiner has already been initialised when this code is reached:
private ExampleSet buildTrainExampleSetNieuw(Category category)
        throws OperatorCreationException, OperatorException {
    OperatorChain wvtoolOperator = (OperatorChain) OperatorService
            .createOperator("TextInput");
    wvtoolOperator.setParameter(TextInput.PARAMETER_DEFAULT_CONTENT_TYPE,
            "application/xml");
    wvtoolOperator.setParameter(
            TextInput.PARAMETER_DEFAULT_CONTENT_LANGUAGE, "dutch");
    wvtoolOperator.setParameter(
            TextInput.PARAMETER_DEFAULT_CONTENT_ENCODING, "iso-8859-1");
    wvtoolOperator.setParameter(TextInput.PARAMETER_PRUNE_BELOW, "3");
    wvtoolOperator.setParameter(TextInput.PARAMETER_PRUNE_ABOVE, "10");
    List<Object[]> textList = new LinkedList<Object[]>();
    textList.add(new Object[] { "Ambtenarenrecht",
            "c:/workspace/documentclassification/trainset/Ambtenarenrecht/" });
    textList.add(new Object[] { "non-Ambtenarenrecht",
            "c:/workspace/documentclassification/trainsetnon/Ambtenarenrecht/" });
    wvtoolOperator.addOperator(OperatorService
            .createOperator(SimpleTokenizer.class));
    wvtoolOperator.setListParameter("texts", textList);
    IOContainer out = wvtoolOperator.apply(new IOContainer());
    return out.get(ExampleSet.class);
}
This is the code in WVToolRapidMinerExample.java that works correctly:
public static void main(String[] args) throws Exception {
    FileInputStream inputStream = new FileInputStream(
            "C:\\workspace\\textplugin\\resources\\operators.xml");
    RapidMiner.init(inputStream, new File("rm_plugins"), true, false,
            false, true);
    inputStream.close();
    OperatorChain wvtoolOperator = (OperatorChain) OperatorService
            .createOperator("TextInput");
    wvtoolOperator.setParameter(TextInput.PARAMETER_DEFAULT_CONTENT_TYPE,
            "application/xml");
    wvtoolOperator.setParameter(
            TextInput.PARAMETER_DEFAULT_CONTENT_LANGUAGE, "dutch");
    wvtoolOperator.setParameter(
            TextInput.PARAMETER_DEFAULT_CONTENT_ENCODING, "iso-8859-1");
    wvtoolOperator.setParameter(TextInput.PARAMETER_PRUNE_BELOW, "3");
    wvtoolOperator.setParameter(TextInput.PARAMETER_PRUNE_ABOVE, "10");
    List<Object[]> textList = new LinkedList<Object[]>();
    // adjust data input
    textList.add(new Object[] { "Ambtenarenrecht",
            "c:/workspace/documentclassification/trainset/Ambtenarenrecht/" });
    textList.add(new Object[] { "non-Ambtenarenrecht",
            "c:/workspace/documentclassification/trainsetnon/Ambtenarenrecht/" });
    wvtoolOperator.addOperator(OperatorService
            .createOperator(SimpleTokenizer.class));
    wvtoolOperator.setListParameter("texts", textList);
    IOContainer out = wvtoolOperator.apply(new IOContainer());
    System.out.println("klaar");
}
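Since the problem seems to be about directories being treated as files, a quick check of how plain java.io.File sees the two paths, independent of RapidMiner, may help narrow things down. The following is only a minimal sketch: the class name PathCheck and the method describe are made up for illustration and are not part of RapidMiner or the text plugin. Running it from both contexts (the standalone example and my application) would show whether the JVM itself reports the paths differently.

```java
import java.io.File;

public class PathCheck {

    // Hypothetical helper: summarises how java.io.File sees a path
    // (existence, directory vs. regular file, readability).
    static String describe(File f) {
        return f.getAbsolutePath()
                + " exists=" + f.exists()
                + " directory=" + f.isDirectory()
                + " file=" + f.isFile()
                + " readable=" + f.canRead();
    }

    public static void main(String[] args) {
        // The two training directories used in the code above.
        String[] paths = {
            "c:/workspace/documentclassification/trainset/Ambtenarenrecht/",
            "c:/workspace/documentclassification/trainsetnon/Ambtenarenrecht/"
        };
        for (String p : paths) {
            System.out.println(describe(new File(p)));
        }
    }
}
```

If both contexts report directory=true and readable=true, the difference presumably lies in how the operator is set up rather than in the paths themselves.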
Any ideas on what I am doing wrong in my code? I use RapidMiner/text plugin 4.2. Any suggestions that help solve this problem would be much appreciated.
Martine