Transform data for RapidMiner - inference stage

chenUser4321
chenUser4321 New Altair Community Member
edited November 5 in Community Q&A
I am using RapidMiner (RM) 4.6 and want to feed data from my own application directly into RM as input.

Figure 7.4 (shown below) in rapidMiner-4.6-tutorial.pdf gives an example of classifier training and inference. For learning,
Model model = learner.learn(exampleSet);
takes an ExampleSet object as input, and section 7.6 explains how to transform data into this class. For inference, however,
container = modelApp.apply(container);
takes an IOContainer object as input, and I cannot find anything in the tutorial that explains how to transform data into this class.

So how can we transform application data into that class for the inference stage?

By the way, besides the code below from Figure 7.4, which shows how to train and test a classifier, are there any other methods or sample code for doing this?

Any comment is greatly appreciated!
public static void main(String[] args) {
    try {
        RapidMiner.init();
        // learn
        Operator exampleSource =
            OperatorService.createOperator(ExampleSource.class);
        exampleSource.setParameter("attributes",
            "/path/to/your/training_data.xml");
        IOContainer container = exampleSource.apply(new IOContainer());
        ExampleSet exampleSet = container.get(ExampleSet.class);
        // here the string based creation must be used since the J48 operator
        // does not have its own class (derived from the Weka library).
        Learner learner = (Learner) OperatorService.createOperator("J48");
        Model model = learner.learn(exampleSet);
        // loading the test set (plus adding the model to result container)
        Operator testSource =
            OperatorService.createOperator(ExampleSource.class);
        testSource.setParameter("attributes", "/path/to/your/test_data.xml");
        container = testSource.apply(new IOContainer());
        container = container.append(model);
        // applying the model
        Operator modelApp =
            OperatorService.createOperator(ModelApplier.class);
        container = modelApp.apply(container);
        // print results
        ExampleSet resultSet = container.get(ExampleSet.class);
        Attribute predictedLabel = resultSet.getPredictedLabel();
        ExampleReader reader = resultSet.getExampleReader();
        while (reader.hasNext()) {
            System.out.println(reader.next().getValueAsString(predictedLabel));
        }
    } catch (IOException e) {
        System.err.println("Cannot initialize RapidMiner: " + e.getMessage());
    } catch (OperatorCreationException e) {
        System.err.println("Cannot create operator: " + e.getMessage());
    } catch (OperatorException e) {
        System.err.println("Cannot create model: " + e.getMessage());
    }
}

Answers

  • fischer
    fischer New Altair Community Member
    Hi,

    You can simply create a new IOContainer() and add your model to it. It's that simple.

    Best,
    Simon
  • chenUser4321
    chenUser4321 New Altair Community Member
    Thanks for the reply!

    However, how do I transform data from my application into an IOContainer?

    In the sample code,

    Operator testSource =
        OperatorService.createOperator(ExampleSource.class);
    testSource.setParameter("attributes", "/path/to/your/test_data.xml");
    container = testSource.apply(new IOContainer());
    container = container.append(model);

    a testSource with the path of the test data file is used to create the IOContainer. Correspondingly, data coming from an application also needs to be put into an IOContainer object before the model is applied.

    So how do I create such an IOContainer? Should I create an ExampleSet and then transform it into an IOContainer? If so, what is the proper way to do this? I do not see an obvious way in the sample code.

    Any information is sincerely appreciated!
  • fischer
    fischer New Altair Community Member
    Hi,

    Sorry, I'm not sure I understand what you are asking. You don't transform IOObjects (like ExampleSets) into IOContainers; you create an IOContainer and append the IOObject to it. It is a container.

    If you are asking how you can convert your own Java data structure into an ExampleSet, then the answer is: Create Attributes, make a MemoryExampleTable from them, and start populating it with DataRows. Finally, use one of the createExampleSet-methods of the ExampleTable to make your ExampleSet.

    Best,
    Simon
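
The container idea in the reply above can be pictured with a small plain-Java sketch. ToyContainer is a hypothetical stand-in written for illustration, not RapidMiner's IOContainer class. It only assumes what the sample code in the question shows: append(...) hands back a container (hence container = container.append(model)), and get(...) looks an object up by its class. Making the toy immutable is a choice of this sketch.

```java
import java.util.ArrayList;
import java.util.List;

class ToyContainerDemo {
    public static void main(String[] args) {
        String model = "trained model";        // stands in for a Model
        double[][] examples = { {1.0, 2.0} };  // stands in for an ExampleSet

        // Nothing is "transformed": objects are simply placed in the container.
        ToyContainer container = new ToyContainer();
        container = container.append(examples);
        container = container.append(model);

        // Later stages pull out what they need by type.
        System.out.println(container.get(String.class));
        System.out.println(container.get(double[][].class).length);
    }
}

// Hypothetical stand-in for the container idea; NOT a RapidMiner class.
final class ToyContainer {
    private final List<Object> items;

    ToyContainer() {
        this(new ArrayList<>());
    }

    private ToyContainer(List<Object> items) {
        this.items = items;
    }

    // Returns a new container holding the old items plus the given one.
    ToyContainer append(Object item) {
        List<Object> copy = new ArrayList<>(items);
        copy.add(item);
        return new ToyContainer(copy);
    }

    // Returns the first stored object that is an instance of the given type.
    <T> T get(Class<T> type) {
        for (Object item : items) {
            if (type.isInstance(item)) {
                return type.cast(item);
            }
        }
        throw new IllegalStateException("No object of type " + type.getSimpleName());
    }
}
```

In the real code from Figure 7.4 the same pattern appears with an ExampleSet and a Model in place of the array and the string.
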
  • chenUser4321
    chenUser4321 New Altair Community Member
    I now have a basic idea of how to do it. Thanks for the reply!
  • chenUser4321
    chenUser4321 New Altair Community Member
    Thanks, Simon, for the reply.

    I have a new question.

    In RM 4.6, is it possible for training features (not the actual label) to have non-numerical values, such as nominal values?

    If so, when transforming data from an application, do we just need to map each non-numerical value to a numerical value, as is done for the classification label in Figure 7.4 in rapidMiner-4.6-tutorial.pdf?

    Any comment is appreciated!
  • chenUser4321
    chenUser4321 New Altair Community Member
    I realize that section 5.4 in rapidMiner-4.6-tutorial.pdf lists the learner capabilities. These include the supported attribute types, such as polynominal, binominal, and numerical attributes. Different learners can support different attribute types. This seems to answer my question to some degree.
  • fischer
    fischer New Altair Community Member
    Hi,

    Nominal attributes have an internal mapping from nominal values to indices (getNominalMapping()). When you build up a DataRow for use in an ExampleTable, you can use this mapping to generate the indices you need. When you set values in an Example, the setValue(Attribute, String) method uses this mapping automatically.

    Best,
    Simon
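
As a rough picture of the value-to-index mapping described in the reply above, here is a plain-Java sketch. ToyNominalMapping and its methods indexFor and valueFor are hypothetical names made up for this example, not the RapidMiner API. The sketch assumes that each distinct nominal value receives the next free index the first time it is seen, and that this index is what gets stored in a numeric data row.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ToyNominalMappingDemo {
    public static void main(String[] args) {
        ToyNominalMapping color = new ToyNominalMapping();

        // Each row: one numerical feature followed by one nominal feature,
        // with the nominal value stored as its index.
        double[] first = { 3.5, color.indexFor("red") };
        double[] second = { 1.0, color.indexFor("blue") };
        double[] third = { 2.0, color.indexFor("red") };

        System.out.println(first[1] + " " + second[1] + " " + third[1]); // 0.0 1.0 0.0

        // The index can be turned back into the original value.
        System.out.println(color.valueFor((int) second[1])); // blue
    }
}

// Hypothetical stand-in for a nominal value/index mapping; NOT a RapidMiner class.
final class ToyNominalMapping {
    private final Map<String, Integer> indexByValue = new HashMap<>();
    private final List<String> valueByIndex = new ArrayList<>();

    // Returns the index of the value, assigning the next free index if it is new.
    int indexFor(String value) {
        Integer index = indexByValue.get(value);
        if (index == null) {
            index = valueByIndex.size();
            indexByValue.put(value, index);
            valueByIndex.add(value);
        }
        return index;
    }

    // Returns the nominal value that was assigned the given index.
    String valueFor(int index) {
        return valueByIndex.get(index);
    }
}
```

The same mapping has to be used for training and test data, so that a given value always ends up with the same index.
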
  • chenUser4321
    chenUser4321 New Altair Community Member
    Thank you Simon! This is very useful information.

    Daozheng