[Delayed] Neural net shuffles elements between clusters
T-Unit
New Altair Community Member
Hello,
I am facing the following problem:
I did some clustering and now have about 1,700 examples that belong to several clusters (cluster_0, cluster_1, ..., cluster_18).
I have an additional 44 examples that should be classified. To classify them, a neural net should learn from the 1,700 examples above (the cluster is the label attribute). The neural net works well so far, but there is a major problem: similar elements are grouped into the same cluster, but the cluster itself seems to be the wrong one. To check whether this is a general problem, I trained the neural net on the large data set (1,700 examples) and then had it classify the same 1,700 examples.
Example:
The training data might look as follows (capital letters represent elements):
cluster_0: A, B, C, D
cluster_10: E, F, G, H
cluster_15: I, J, K, L
When the model generated by the neural net is applied to the same data that was used to train it, the results are, for example:
cluster_0: E, F, G, H
cluster_10: I, J, K, L
cluster_15: A, B, C, D
... the elements are grouped together (fine!) but not into the right group (not fine!).
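One way to check whether the predictions are only a consistent renaming of the clusters is to tabulate, for each true label, which predicted label its elements receive. The following is a minimal Python sketch of that check, not part of the RapidMiner process; the function names `label_mapping` and `is_pure_relabeling` are hypothetical, and the data is the toy example above.

```python
from collections import Counter, defaultdict


def label_mapping(true_labels, predicted_labels):
    """For each true label, return the predicted label it is most often assigned."""
    counts = defaultdict(Counter)
    for true, predicted in zip(true_labels, predicted_labels):
        counts[true][predicted] += 1
    return {true: c.most_common(1)[0][0] for true, c in counts.items()}


def is_pure_relabeling(true_labels, predicted_labels):
    """True if every group stays intact and distinct but at least one label name differs."""
    seen = defaultdict(set)
    for true, predicted in zip(true_labels, predicted_labels):
        seen[true].add(predicted)
    # A group is intact only if all of its elements got the same predicted label.
    if any(len(predicted) != 1 for predicted in seen.values()):
        return False
    mapping = {true: next(iter(predicted)) for true, predicted in seen.items()}
    # Distinct groups must not be merged into one predicted label.
    one_to_one = len(set(mapping.values())) == len(mapping)
    return one_to_one and any(true != pred for true, pred in mapping.items())


# Toy data from the example: elements A..L in order.
true_labels = ["cluster_0"] * 4 + ["cluster_10"] * 4 + ["cluster_15"] * 4
predicted_labels = ["cluster_15"] * 4 + ["cluster_0"] * 4 + ["cluster_10"] * 4

mapping = label_mapping(true_labels, predicted_labels)
shuffled = is_pure_relabeling(true_labels, predicted_labels)
print(mapping)
print(shuffled)
```

If `shuffled` is true, the model separates the groups correctly and only the label names are permuted, which points at label handling rather than at the learning itself.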
Does anyone know how to solve this problem?
Greetings,
Thomas
PS: I would like to post the process I use, but my message would exceed the maximum of 20,000 characters. Is the whole process needed, or would some parts of it be enough?
Answers
Hi Thomas,
That is actually a bug, and I admit that it is an exceptionally nasty one. I already have an idea of the cause, so there is no need to post your process setup. We will include a fix in one of the next releases.
Unfortunately, I can't think of any workaround currently, so the only thing I can suggest until the bug is fixed is to try another classification algorithm.
Best regards,
Marius

Unfortunately, I can't reproduce the error. Which version of RapidMiner are you using? Does this still happen in the latest version, RapidMiner 5.3? If so, I would need your process setup. Please cut your process as much as is possible while still reproducing the error before posting.
Best regards,
Marius

Hello Marius,
Thanks for your reply.
Because I am running out of time on my project, I tried many modifications of yesterday's process. I now have around 30 new processes, and I am afraid I have overwritten the relevant one, but I will take a closer look at my files when I have more time.
I will also try to cut the process down, because the "bug" I noticed yesterday occurred while I was using RM 5.3.
I have set the thread's status to "delayed" until I have had a closer look at my files.
Thanks anyway for reading and replying to my questions.
Greetings,
Thomas
PS: Because it was very urgent, I tried the "decision tree" instead of the "neural net" to classify the unseen data and, as far as I can see, it works fine.

Seems like I have the same problem. I use the current version 3.5.
Link: http://rapid-i.com/rapidforum/index.php/topic,6231.msg21807.html#msg21807
I posted my process there, so you can probably reproduce this bug with that information.
For now it helps a lot to know that this is a bug. If you find a workaround, please post it. I am still interested in using the neural net classification.