Apply Neural Net to Unlabelled Data

New Altair Community Member

Jul 31, 2017

Updated Nov 5, 2024 by Jocelyn

Hi everyone,

First of all, I'm pretty new to RapidMiner. I'm a student working with the tool for educational purposes.

I built a model for predicting a binominal outcome using a neural network. I have a labelled training dataset and an unlabelled one for application; both have the same structure. I managed to train my neural network inside a Cross Validation operator and measure its performance on the training data, and I can also apply the model to my application data. After applying it, 3 new columns are created (predicted(outcome), confidence(yes), confidence(no)), but I'm not sure if I'm doing this right. I can't use another Performance operator after applying the model, because it would require labelled input. Is there another way to get the same performance vector as for the training data, in order to check accuracy, precision and recall on the new application data, or would this require a labelled dataset? How can I check the performance of my model nonetheless?

I'd really appreciate your help!

Greets John


Thomas_Ott

New Altair Community Member

Aug 1, 2017

You can output the PER port from the Validation operator to get the performance measures.

JEdward

New Altair Community Member

Accepted Answer

Aug 1, 2017

Hi John,

Actually, it sounds like you're doing it right. What I tend to do before building the model is split my labelled data into a Training set and a Test set.
So you have 3 datasets:

Training
- You train your model on this dataset. In this step I perform a Cross Validation to see how the model 'should' perform in practice.
Testing
- This is a dataset you can apply the model to, but it ALSO has the historic labels, so you can use it to measure the performance of your model. I can test many models together and select the best-performing ones using the Compare Models operator.
Unlabelled (Scoring)
- Once you have tested your model and are pretty confident in it, you can apply it to your unlabelled data and 'trust' that the performance should be close to what you saw in testing.
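
Outside of RapidMiner, the three-dataset idea above can be sketched in a few lines of plain Python. This is only an illustration under assumptions: the data is synthetic, the column names `x` and `outcome` are made up, and a simple threshold rule stands in for the neural network so that the example needs no extra libraries. The point is that accuracy, precision and recall are computed on the labelled test set, while the scoring set only ever receives predictions.

```python
import random

def make_rows(n, rng, labelled=True):
    """Synthetic rows with one feature x; 'outcome' is a noisy function of x."""
    rows = []
    for _ in range(n):
        x = rng.random()
        row = {"x": x}
        if labelled:
            row["outcome"] = "yes" if x + rng.gauss(0, 0.1) > 0.5 else "no"
        rows.append(row)
    return rows

def fit_threshold(train_rows):
    """Stand-in for model training: midpoint between the two class means of x."""
    yes = [r["x"] for r in train_rows if r["outcome"] == "yes"]
    no = [r["x"] for r in train_rows if r["outcome"] == "no"]
    return (sum(yes) / len(yes) + sum(no) / len(no)) / 2

def predict(threshold, rows):
    """Apply the stand-in model; works on labelled and unlabelled rows alike."""
    return ["yes" if r["x"] > threshold else "no" for r in rows]

def performance(actual, predicted, positive="yes"):
    """Accuracy, precision and recall; these need the true labels."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }

rng = random.Random(42)

# 1) Labelled data, split 70/30 into Training and Testing.
labelled = make_rows(500, rng)
rng.shuffle(labelled)
split = int(0.7 * len(labelled))
train, test = labelled[:split], labelled[split:]

# 2) Unlabelled (Scoring) data: same structure, but no 'outcome' column.
scoring = make_rows(100, rng, labelled=False)

model = fit_threshold(train)

# Performance can only be measured where the true labels are known.
test_perf = performance([r["outcome"] for r in test], predict(model, test))
print("test performance:", test_perf)

# The scoring set just gets predictions; there is nothing to compare them to.
scoring_predictions = predict(model, scoring)
print("first scoring predictions:", scoring_predictions[:5])
```

The test-set numbers are then your best estimate of how the model will behave on the scoring data, provided both come from the same kind of source.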

Try this now and look at the results. Great, right?

However, how can you be really sure you can 'trust' your model? You've only tested it once; maybe it just got 'lucky' and in reality it's not going to perform as expected.

There are various ways to ensure you can trust your tested model performance, so after you've tried out the Split Validation operator, I'd like you to read this series of 4 blog posts by @IngoRM and download the repository with sample processes.
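
To give a feel for why a single test can be 'lucky', here is a rough plain-Python sketch of k-fold cross-validation, one common way to repeat the test on several different splits. It is not taken from the blog posts; the synthetic data, the helper names and the threshold rule standing in for the neural network are all assumptions made for illustration. Each fold is held out once, and the spread of the per-fold accuracies shows how much a single split can vary.

```python
import random
import statistics

def make_rows(n, rng):
    """Synthetic (x, label) pairs; the label is a noisy function of x."""
    rows = []
    for _ in range(n):
        x = rng.random()
        label = "yes" if x + rng.gauss(0, 0.1) > 0.5 else "no"
        rows.append((x, label))
    return rows

def fit_threshold(rows):
    """Stand-in for model training: midpoint between the two class means."""
    yes = [x for x, label in rows if label == "yes"]
    no = [x for x, label in rows if label == "no"]
    return (sum(yes) / len(yes) + sum(no) / len(no)) / 2

def accuracy(threshold, rows):
    """Share of rows where the thresholded prediction matches the label."""
    hits = sum(1 for x, label in rows
               if ("yes" if x > threshold else "no") == label)
    return hits / len(rows)

def cross_validate(rows, k, rng):
    """Hold out each of k folds once; train on the rest, test on the fold."""
    rows = rows[:]                      # do not shuffle the caller's list
    rng.shuffle(rows)
    folds = [rows[i::k] for i in range(k)]
    scores = []
    for i, held_out in enumerate(folds):
        training = [r for j, fold in enumerate(folds) if j != i for r in fold]
        scores.append(accuracy(fit_threshold(training), held_out))
    return scores

rng = random.Random(7)
data = make_rows(300, rng)
fold_scores = cross_validate(data, k=10, rng=rng)

print("per-fold accuracy:", [round(s, 3) for s in fold_scores])
print("mean:", round(statistics.mean(fold_scores), 3))
print("std dev:", round(statistics.stdev(fold_scores), 3))
```

If the per-fold values differ a lot, a single train/test split could easily have landed on one of the better or worse ones, which is why the average plus its spread is more trustworthy than one number.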

Let us know here how you get on!

Learn the Right Way to Validate Models

John25

New Altair Community Member

Aug 1, 2017

Thanks a lot for your help!

It made things a lot clearer for me and guided me to a functioning model. Now I'm starting to improve the various parameters.

abbasi_samira

New Altair Community Member

Jan 8, 2018

How can I use a neural network to predict the degree of cancer spread?

Thanks

lionelderkrikor

New Altair Community Member

Jan 8, 2018

Hi @abbasi_samira,

Can you share your dataset(s) and your process (if you have one), please?

Regards,

Lionel

abbasi_samira

New Altair Community Member

Jan 8, 2018

Hello,
Unfortunately, for security reasons, I cannot share the data.
But I can explain more about it.

lionelderkrikor

New Altair Community Member

Jan 8, 2018

Hi again @abbasi_samira,

OK, I understand.

Yes, you can explain more fully and maybe share a "fictive example set" inspired by your data (so we know the size and structure of your data).

Regards,

Lionel
