"Getting started - very simple neural network training
Hi there,
I am working on the business intelligence part of a large project and am looking for an appropriate tool. RapidMiner looks very promising so far, though I am having trouble getting a simple neural net training to run. As a starting point for experimenting, I'd like RapidMiner to learn the body mass index formula classification.
I've got two files, "training" and "testing", each containing 100 data examples. They both look like this:
The format is: weight (kg) - height (cm) - age (y) - classification (1 = superb, 0 = ok, -1 = bad). E.g. the first line means a person who weighs 69 kg and is 189 cm tall at the age of 38, which is "ok".
69 189 38 0
66 193 60 -1
63 161 59 1
74 187 36 1
68 182 37 0
63 169 46 1
75 158 30 -1
92 145 47 -1
52 160 50 0
...
Now I can import the file "training" and connect it to a neural net, and I connect the neural net to the process output. RapidMiner trains the net and delivers the weights.
Now how do I test the weights with the test set? Help would be very much appreciated.
Kind Regards
Theo
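For readers following along outside RapidMiner, here is a minimal Python sketch of the file format described above. It parses the whitespace-separated rows and computes the standard body mass index, weight in kg divided by the square of height in m. The names `Example`, `parse_line`, and `bmi` are made up for this illustration, and the thread does not say which BMI cutoffs were used to produce the class labels, so none are assumed here.

```python
from typing import NamedTuple


class Example(NamedTuple):
    """One row of the training/testing file (field names are illustrative)."""
    weight_kg: float
    height_cm: float
    age_y: int
    label: int  # 1 = superb, 0 = ok, -1 = bad, as described in the post


def parse_line(line: str) -> Example:
    """Split one whitespace-separated row into its four columns."""
    weight, height, age, label = line.split()
    return Example(float(weight), float(height), int(age), int(label))


def bmi(example: Example) -> float:
    """Standard BMI: weight in kg divided by squared height in metres."""
    height_m = example.height_cm / 100.0
    return example.weight_kg / (height_m ** 2)


# First three rows of the sample shown in the post.
SAMPLE = """\
69 189 38 0
66 193 60 -1
63 161 59 1
"""

examples = [parse_line(row) for row in SAMPLE.splitlines() if row.strip()]
for ex in examples:
    print(f"{ex.weight_kg:.0f} kg, {ex.height_cm:.0f} cm -> "
          f"BMI {bmi(ex):.1f}, label {ex.label}")
```

Note that age is part of each row but plays no role in the BMI formula itself.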
Hi Sebastian,
Thank you for the help. Your example was very much the same scheme I had used, except, of course, for the data import, which seems to cause me the most difficulty now. Could you maybe take a look at my process:
It does learn now, but is still faulty, since the learning capability of the neural net seems to vary heavily with the amount of testing (!!) data.
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.0">
  <context>
    <input>
      <location/>
    </input>
    <output>
      <location/>
      <location/>
      <location/>
      <location/>
    </output>
    <macros/>
  </context>
  <operator activated="true" class="process" expanded="true" name="Process">
    <process expanded="true" height="597" width="904">
      <operator activated="true" class="read_aml" expanded="true" height="60" name="Read AML" width="90" x="36" y="81">
        <parameter key="attributes" value="C:\testNN4\trainingaml.aml"/>
      </operator>
      <operator activated="true" class="neural_net" expanded="true" height="76" name="Neural Net" width="90" x="179" y="30">
        <list key="hidden_layers">
          <parameter key="null" value="7"/>
          <parameter key="null" value="3"/>
        </list>
        <parameter key="training_cycles" value="50"/>
        <parameter key="decay" value="true"/>
      </operator>
      <operator activated="true" class="read_aml" expanded="true" height="60" name="Read AML (2)" width="90" x="45" y="210">
        <parameter key="attributes" value="C:\testNN4\testaml.aml"/>
      </operator>
      <operator activated="true" class="apply_model" expanded="true" height="76" name="Apply Model" width="90" x="313" y="165">
        <list key="application_parameters"/>
      </operator>
      <operator activated="true" class="multiply" expanded="true" height="94" name="Multiply" width="90" x="447" y="120"/>
      <operator activated="true" class="performance" expanded="true" height="76" name="Performance" width="90" x="648" y="210"/>
      <connect from_op="Read AML" from_port="output" to_op="Neural Net" to_port="training set"/>
      <connect from_op="Neural Net" from_port="model" to_op="Apply Model" to_port="model"/>
      <connect from_op="Neural Net" from_port="exampleSet" to_port="result 2"/>
      <connect from_op="Read AML (2)" from_port="output" to_op="Apply Model" to_port="unlabelled data"/>
      <connect from_op="Apply Model" from_port="labelled data" to_op="Multiply" to_port="input"/>
      <connect from_op="Multiply" from_port="output 1" to_op="Performance" to_port="labelled data"/>
      <connect from_op="Multiply" from_port="output 2" to_port="result 3"/>
      <connect from_op="Performance" from_port="performance" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
      <portSpacing port="sink_result 3" spacing="0"/>
      <portSpacing port="sink_result 4" spacing="0"/>
    </process>
  </operator>
</process>
Kind Regards
Theo
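As a rough illustration of what the Apply Model and Performance steps in the process above amount to, here is a minimal Python sketch: a model produces one prediction per test row, and the predictions are compared with the known labels. This is not RapidMiner code. `apply_model`, `accuracy`, and `toy_model` are hypothetical names, accuracy is only one possible performance measure, and `toy_model` with its cutoff of 27 is an invented stand-in for the trained neural net.

```python
from typing import Callable, List, Sequence, Tuple

Row = Tuple[float, float, float]  # weight (kg), height (cm), age (y)


def apply_model(model: Callable[[Row], int], rows: Sequence[Row]) -> List[int]:
    """Rough analogue of Apply Model: one prediction per unlabelled row."""
    return [model(row) for row in rows]


def accuracy(labels: Sequence[int], predictions: Sequence[int]) -> float:
    """Rough analogue of Performance: share of predictions equal to the label."""
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")
    if not labels:
        raise ValueError("cannot evaluate on an empty test set")
    hits = sum(1 for label, pred in zip(labels, predictions) if label == pred)
    return hits / len(labels)


def toy_model(row: Row) -> int:
    """Hypothetical stand-in for the trained net; the cutoff 27 is made up."""
    weight_kg, height_cm, _age = row
    bmi = weight_kg / (height_cm / 100.0) ** 2
    return -1 if bmi > 27 else 1


# Four rows from the sample data in the first post, split into features and label.
test_rows = [(75, 158, 30), (92, 145, 47), (63, 161, 59), (69, 189, 38)]
test_labels = [-1, -1, 1, 0]

predictions = apply_model(toy_model, test_rows)
print(predictions, accuracy(test_labels, predictions))
```

The score is computed only from the test rows, so with a small test set each single misclassification moves it noticeably. Whether that accounts for the variation described in the post is not settled in this thread.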
Hi,
I finally succeeded in building a process that learns the body mass index formula. The model learned by the neural net classifies softly between +1 (healthy) and -1 (overweight). Values are e.g. "+1.037" or "-0.360".
I would like the values to be "crisp", i.e. separated into 3 classes "+1", "zero" and "-1", which symbolize the health of the patient. How do I do that?
I already tried mapping and threshold, but it didn't work out. The "map" operator would not touch the "prediction (BMI)" column which is produced by the "apply model" operator. How do I set up a correct mapping?
Thanks in advance
Theo
Hi Theo,
I would recommend using the "Discretize by User Specification" operator. You will have to enter the upper bound for each class. To restrict it to the prediction column, set the attribute filter type to "single attribute" and then select that attribute.
Greetings,
Sebastian
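The idea of assigning a class by upper bound can be sketched in a few lines of Python. This is only an illustration of the principle, not the operator's implementation. The bounds -0.5 and 0.5 are hypothetical, and treating a value equal to a bound as belonging to the lower class is an assumption as well.

```python
from bisect import bisect_left

# Hypothetical upper bounds, one per class, in ascending order.
UPPER_BOUNDS = [-0.5, 0.5, float("inf")]
CLASS_NAMES = ["-1", "zero", "+1"]


def discretize(value: float) -> str:
    """Return the first class whose upper bound is >= value."""
    return CLASS_NAMES[bisect_left(UPPER_BOUNDS, value)]


# The two soft predictions quoted in the question above.
for soft_prediction in (1.037, -0.360):
    print(soft_prediction, "->", discretize(soft_prediction))
```

With these made-up bounds, "+1.037" lands in class "+1" and "-0.360" in class "zero". Moving the bounds changes where the classes begin and end.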
Good Morning,
Sebastian Land wrote:
The threshold operators are applied to classification confidences to bias the classification result manually.
I see. Sounds like it was also the right operator for the task I finally solved with "discretize".
Sebastian Land wrote:
What do you mean by modules?
Sorry, I meant "operators".
Kind Regards
Theo
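To make the remark about thresholds concrete: a threshold on a classification confidence decides which class is reported, and moving it biases the result towards one class. A minimal Python sketch of that idea follows. `classify_with_threshold` and the class names "positive" and "negative" are hypothetical, and this is not how RapidMiner's threshold operators are configured.

```python
def classify_with_threshold(confidence: float, threshold: float = 0.5) -> str:
    """Report "positive" only if the confidence for it reaches the threshold."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must lie in [0, 1]")
    return "positive" if confidence >= threshold else "negative"


confidence = 0.6  # made-up confidence for the positive class
print(classify_with_threshold(confidence))       # default threshold of 0.5
print(classify_with_threshold(confidence, 0.7))  # stricter threshold
```

The same confidence of 0.6 is reported as "positive" at the default threshold and as "negative" once the threshold is raised to 0.7.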
Hi Theo,
Operators are fixed units inside RapidMiner. You can extend RapidMiner relatively easily with your own operators that solve your own problems by writing an Extension. This is done in Java, and you can inherit from the existing operators to change some of their behavior. A whitepaper for developers is on the way. (See the many other threads about this...)
Greetings,
Sebastian
That's exactly what RapidMiner was originally designed for. I will post a (senseless) example process which will make it instantly clear how to use a test set for performance evaluation. It's a RapidMiner 5 process, and I would recommend switching to RapidMiner 5 if you are still using RapidMiner 4.x.
To get an impression of what can be done with RapidMiner, I would recommend going through the samples delivered with RapidMiner. Many important design patterns and the most important operators are described there, together with sample processes.
Greetings,
Sebastian