"Still want to understand what happens with RM-Nn vs. Weka-NN"
michaelhecht
New Altair Community Member
Hmmm, no one seems to be interested, but I won't give up :-)
I tried to make the problem more compact. I have a short workflow with fully prepared (Z-normalized) data.
The data are extremely reduced (due to the upload size limit), but the result is comparable.
Running the workflow below gives an acceptable result: if I plot err (the label column) versus the predicted err,
I get not a perfect 1:1 line but an elliptic curve (not a circle!!). After replacing Weka-MultilayerPerceptron with RM-NeuralNet,
even with the same parameters, I get a totally confused prediction. It is neither an underfitted nor a
constant result (as an NN sometimes produces) but a dataset that even has a different mean value (-0.9 instead of about 0.0).
To me it all looks like a basic problem with RM-NN, independent of the actual data, that I cannot work around.
I would really appreciate any suggestion.
Here is the workflow:
<?xml version="1.0" encoding="windows-1252"?>
<process version="4.3">
<operator name="Root" class="Process" expanded="yes">
<parameter key="logverbosity" value="init"/>
<operator name="MemoryCleanUp" class="MemoryCleanUp">
</operator>
<operator name="CSVExampleSource" class="CSVExampleSource">
<parameter key="datamanagement" value="double_array"/>
<parameter key="filename" value="C:\temp\wwexamples.csv"/>
<parameter key="label_column" value="11"/>
</operator>
<operator name="W-MultilayerPerceptron" class="W-MultilayerPerceptron">
<parameter key="G" value="true"/>
<parameter key="keep_example_set" value="true"/>
</operator>
<operator name="ModelApplier" class="ModelApplier">
<list key="application_parameters">
</list>
</operator>
</operator>
</process>
[attachment deleted by admin]
0
Answers
-
This request has now been in the forum for a few days.
Is there no one who would just run the workflow with the attached data, once with Weka-NN and once with
RM-NeuralNet, to see what the difference is, or maybe to explain that I made a mistake?
I'm very sad, since it is important for me to understand the different solutions (if the RM-NN result can be called a "solution").
I think it is not too much work to do, is it?
Thank you very much in advance!!!
0 -
Perhaps you would be so good as to set the parameters for the RM Neural Net in the following code, and post same here?
<operator name="Root" class="Process" expanded="yes">
<operator name="MemoryCleanUp" class="MemoryCleanUp">
</operator>
<operator name="CSVExampleSource" class="CSVExampleSource">
<parameter key="filename" value="C:\Users\CJFP\Documents\rm_workspace\sample\data\nnexpl.csv"/>
<parameter key="label_column" value="11"/>
</operator>
<operator name="NeuralNet" class="NeuralNet">
<list key="hidden_layer_types">
</list>
</operator>
<operator name="W-MultilayerPerceptron" class="W-MultilayerPerceptron" activated="no">
<parameter key="keep_example_set" value="true"/>
<parameter key="G" value="true"/>
</operator>
<operator name="ModelApplier" class="ModelApplier">
<list key="application_parameters">
</list>
</operator>
</operator>
0 -
Hi,
"This request is now for a few days in the forum."
The same is true for several of the questions I asked you in several different discussions. Did you ever bother to answer them and help us get better insight into what is happening? No. So please do not expect others to help you if you, on the other hand, are not willing to give something back.
All the best,
Ingo
0 -
haddock:
sorry, I'm currently at home, but the example is on my office computer. I had a bit of other stress over the last few days, so I had no time to check the forum. I will reply as soon as possible. By the way: the settings for RM-NN were the defaults, i.e. learning=0.3, momentum=0.2, cycles=200 (and more) and epsilon=0.05 (does this answer your question?). I set the Weka-NN parameters to the same values and got good results. It would help me to know whether on other systems the result of RM-NN is comparable to Weka-NN or totally unusable (as it is on my computer).
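A minimal sketch of how those settings might look in the process XML, following the layout of haddock's example above. For W-MultilayerPerceptron, L, M and N are Weka's option letters for learning rate, momentum and number of training epochs. The parameter keys shown for the NeuralNet operator (learning_rate, momentum, training_cycles, error_epsilon) are assumed names and were not checked against RM 4.3; Weka's option list has no obvious counterpart to epsilon, so none is set there.

```xml
<operator name="Root" class="Process" expanded="yes">
    <operator name="CSVExampleSource" class="CSVExampleSource">
        <parameter key="filename" value="C:\temp\wwexamples.csv"/>
        <parameter key="label_column" value="11"/>
    </operator>
    <!-- RM learner; the four keys below are ASSUMED names for
         learning=0.3, momentum=0.2, cycles=200, epsilon=0.05 -->
    <operator name="NeuralNet" class="NeuralNet">
        <parameter key="learning_rate" value="0.3"/>
        <parameter key="momentum" value="0.2"/>
        <parameter key="training_cycles" value="200"/>
        <parameter key="error_epsilon" value="0.05"/>
        <list key="hidden_layer_types">
        </list>
    </operator>
    <!-- Weka learner, deactivated; activate it (and deactivate NeuralNet)
         to compare. L = learning rate, M = momentum, N = training epochs -->
    <operator name="W-MultilayerPerceptron" class="W-MultilayerPerceptron" activated="no">
        <parameter key="keep_example_set" value="true"/>
        <parameter key="L" value="0.3"/>
        <parameter key="M" value="0.2"/>
        <parameter key="N" value="200"/>
    </operator>
    <operator name="ModelApplier" class="ModelApplier">
        <list key="application_parameters">
        </list>
    </operator>
</operator>
```

Switching the activated flag between the two learners keeps the data source and the ModelApplier identical, so any difference in the predictions comes from the learner alone.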
Ingo:
there was no smiley in your e-mail, so you obviously meant it seriously. Sorry, it was not clear to me that I have to give one answer to get another back. Since all the other issues I posted were answered quite quickly (which made me really happy), I was really wondering why no one reacted to this one.
Since I am one of the decision makers for scientific software in our company (~ 6000 employees), I wonder whether your commercial support is comparable or different. We are currently in an evaluation phase for data mining software, which is why it is so important for me to see what the different software products are capable of. So, excuse me for demanding answers too insistently. (You will find no smiley in this posting either)
0 -
Hello again,
sure, I was completely serious. Please consider that this is a free forum for the community of a free piece of software. So all you can hope for is that somebody is willing to spare some of their time to help you with your problems. If this is the case: fine. If not: there is hardly anything you can do about that.
And since you asked: of course our commercial support is different (as the small overlap between commercial users and forum users might confirm, although most commercial users do not use this forum but our support tracker system). The commercial support offers everything that is not guaranteed by this forum: help and answers within a fixed time limit, guaranteed bug fixes, etc. More information can be found on our web page for the Enterprise Edition at
http://rapid-i.com/content/view/123/141/
End of commercial
"Sorry, it was not clear to me that I have to give one answer to get another back."
Just imagine our position: you get our software for free, plus a couple of answers and quite an amount of our time in order to help you. But then we never got any answer back. That doesn't look too nice, does it?
Please don't get me wrong: I really like to help the free community users as much as I can, but there is a limit where I think: ok, now it is starting to take too much of my time to concentrate on a single user only. And this is especially true if I get the feeling that things are becoming too unbalanced. However, let us settle this sub-discussion now and see if we can get back to our topic, ok? And to demonstrate that I really would like to get back to a nice discussion: here is also a good-will smiley :-)
Ok, now back to your problem: in the meantime we found out (we were already suspecting this) that the wrong predictions mainly occurred for regression problems (we mainly tested the new version of the NN learner on classification problems). For classification problems, the NN with correct settings / preprocessing was able to adapt to the data as usual and the same is true for the new NNSimple variant. Things are, however, different on regression problems.
It took me all my free time during the last week and the weekend to come up with a new NN implementation which is independent of third-party libraries and works well for both classification and regression problems. And it is much faster than the old one, too. I also added some hotfixes to the two old neural net implementations, which unfortunately cannot be completely fixed without breaking (yet again) compatibility with older versions. Since they are also slower than the new one, both old neural net operators will be deprecated starting with the next update / release and will be removed in some future version.
Users of the Enterprise Edition of RapidMiner will get the new implementation "NeuralNetImproved" with the next Enterprise Edition update together with the linear regression weighting fix and some other bugfixes and some new extensions. Users of the free Community Edition will have to wait for the next major release - which will probably take some time since we are currently working on RM 5.0.
So thanks again for insisting on the NN bug for regression tasks.
Cheers,
Ingo
0 -
Hello,
"... but there is some border where I think: ok, now it is starting to take too much of my time to concentrate on a single user only"
So in the end it was helpful not only for me but also for the community and for the Enterprise version of RM. Furthermore, you see that you can never fully separate the commercial and the free branch of a piece of software.
I will wait until RM 5.0 is released (even though I never trust an x.0 version of any software), unless we purchase the software before the next free release.
0