Auto Model Performance: Is it training, testing, or validation?
Konradlk
varunm1
Hello @Konradlk,
Auto Model divides the original dataset with a 60:40 split (train:test). Validation in Auto Model is a multi-hold-out-set validation: the model is trained on the 60% portion, and the 40% test partition is divided into 7 subsets. Once trained, the model makes predictions on each of the 7 subsets independently, and the performances on those 7 subsets are averaged. So the performance you see in Auto Model is the test-data performance from this multi-hold-out validation method (sketched below).
Hope this helps. Please let us know if you need more information.
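For readers who want to see the mechanics, here is a minimal Python/scikit-learn sketch of the same multi-hold-out idea. It is an assumed analogue for illustration, not RapidMiner's actual implementation, and the dataset is synthetic:

```python
# Sketch of a multi-hold-out-set validation: 60:40 train/test split,
# the test partition cut into 7 subsets, per-subset RMSEs averaged.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

X, y = make_regression(n_samples=700, n_features=10, noise=0.1, random_state=42)

# 60:40 split, as Auto Model does
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=0.6, random_state=42)

model = MLPRegressor(hidden_layer_sizes=(2,), max_iter=2000, random_state=42)
model.fit(X_train, y_train)

# Score each of the 7 test subsets independently, then average
subset_rmses = [
    mean_squared_error(y_sub, model.predict(X_sub)) ** 0.5
    for X_sub, y_sub in zip(np.array_split(X_test, 7), np.array_split(y_test, 7))
]
print(f"multi-hold-out RMSE: {np.mean(subset_rmses):.3f}")
```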
Konradlk
Thank you so much, @varunm1. Do you have resources to find the other errors?
varunm1
> Do you have resources to find the other errors?
Could you clarify what kind of resources and errors you are looking for? If you click on "Performance" for each model, you can find different performance metrics such as accuracy, precision, recall, etc.
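For reference, here is a minimal scikit-learn sketch of those three metrics, computed on made-up labels and predictions; it is an illustrative analogue, not AI Studio's own output:

```python
# Minimal sketch of the classification metrics mentioned above.
from sklearn.metrics import accuracy_score, precision_score, recall_score

y_true = [1, 0, 1, 1, 0, 1, 0, 0]  # actual labels (made up)
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]  # model predictions (made up)

print("accuracy: ", accuracy_score(y_true, y_pred))   # fraction correct
print("precision:", precision_score(y_true, y_pred))  # TP / (TP + FP)
print("recall:   ", recall_score(y_true, y_pred))     # TP / (TP + FN)
```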
Konradlk
Hi @varunm1, I'm looking to get the performance vector for each step of the process, i.e. the performance vectors for training, validation, and testing.
I was previously using a process a coworker left me, and they explicitly said that they need errors for all 3 stages. I am sorry that this is unclear; I do not have the greatest understanding of this and am trying to learn very quickly.
My goal is to run several different prediction models and compare their performance (see the sketch below).
The picture below is what I was left with. I can post more information if necessary.
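One way to picture that kind of model comparison outside AI Studio is the hypothetical scikit-learn sketch below: fit several model types on the same synthetic data and compare their cross-validated RMSE. The models and data here are assumptions for illustration only:

```python
# Hypothetical sketch: compare several prediction models on one dataset,
# using cross-validated RMSE as the common yardstick.
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPRegressor
from sklearn.svm import SVR

X, y = make_regression(n_samples=400, n_features=8, noise=0.1, random_state=3)

models = {
    "neural net": MLPRegressor(hidden_layer_sizes=(2,), max_iter=2000,
                               random_state=3),
    "linear":     LinearRegression(),
    "svm":        SVR(),
}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=10,
                             scoring="neg_root_mean_squared_error")
    print(f"{name}: RMSE = {-scores.mean():.3f}")
```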
lionelderkrikor
Hi @Konradlk,
Your process is correct. You do indeed have:
- the training performance (given by the Performance operator in the "training" part of the Cross Validation operator)
- the validation performance (given by the Performance operator in the "testing" part of the Cross Validation operator)
- the testing performance (given by the Performance operator in the main process)
(These three measurements are illustrated in the sketch below.)
Do you encounter any errors with this process?
Regards,
Lionel
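For readers following along outside AI Studio, here is a minimal scikit-learn sketch of the same three measurements. It is an assumed analogue of the operators Lionel names, not their actual implementation: the training and validation performances come from cross-validation, and the testing performance from a separate hold-out set:

```python
# Sketch of the three performance measurements:
# training + validation from cross-validation, testing from a hold-out set.
from sklearn.datasets import make_regression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import cross_validate, train_test_split
from sklearn.neural_network import MLPRegressor

X, y = make_regression(n_samples=500, n_features=8, noise=0.1, random_state=0)
X_dev, X_test, y_dev, y_test = train_test_split(X, y, test_size=0.3,
                                                random_state=0)

model = MLPRegressor(hidden_layer_sizes=(2,), max_iter=2000, random_state=0)

# Cross-validation yields the "training" and "validation" performances
cv = cross_validate(model, X_dev, y_dev, cv=10,
                    scoring="neg_root_mean_squared_error",
                    return_train_score=True)
print("training RMSE:  ", -cv["train_score"].mean())
print("validation RMSE:", -cv["test_score"].mean())

# A final fit plus the hold-out set yields the "testing" performance
model.fit(X_dev, y_dev)
rmse = mean_squared_error(y_test, model.predict(X_test)) ** 0.5
print("testing RMSE:   ", rmse)
```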
Konradlk
@lionelderkrikor I do encounter errors when I try to change the neural network to Deep Learning, Generalized Linear Model, or SVM.
The problem I run into is that no matter which predictive model I run, I get exactly the same error values for each performance test.
When I run Auto Model I get different error values for each model, but not when I change the model in my process. I change models by simply swapping the Neural Net operator for whatever else I want to run.
varunm1
> I do encounter errors when I try to change the neural network to Deep Learning, Generalized Linear Model, or SVM.
Could you give us the details of those errors? If possible, provide us with your data and .rmp file so we can debug.
> When I run Auto Model I get different error values for each model, but not when I change the model in my process.
You might get different error values because the processes are different.
lionelderkrikor
@Konradlk The method of validation is different in Auto Model and in your process:
- In Auto Model, a split validation with a multi-hold-out-set validation is performed, as described by Varun. You can open the process generated by Auto Model to understand how your model is validated.
- In your process, you are using a Cross Validation.
Although performance should not differ significantly between the two cases, the use of two different validation methods can explain the differences.
Moreover, you are applying a preprocessing step to your data (Normalization). To my knowledge, Auto Model does not apply such a preprocessing step by default, and this difference in preprocessing can also explain the difference in the performance results (see the sketch below). Once again, you can open the process generated by Auto Model and compare it to your process.
But so that we can reproduce what you observe and find out exactly what is going on, could you share your data and your process (the one in your screenshot)?
Regards,
Lionel
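To illustrate the point about preprocessing, here is a hedged scikit-learn sketch: the same model scored with and without a normalization step. StandardScaler is assumed here as a stand-in for RapidMiner's Normalize operator, and the data is synthetic:

```python
# Sketch of the preprocessing difference: one model scored raw,
# the same model scored with a normalization step in its pipeline.
from sklearn.datasets import make_regression
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_regression(n_samples=400, n_features=8, noise=0.1, random_state=1)
model = MLPRegressor(hidden_layer_sizes=(2,), max_iter=2000, random_state=1)

for name, estimator in [("raw", model),
                        ("normalized", make_pipeline(StandardScaler(), model))]:
    rmse = -cross_val_score(estimator, X, y, cv=10,
                            scoring="neg_root_mean_squared_error").mean()
    print(f"{name}: RMSE = {rmse:.3f}")
```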
Konradlk
@varunm1 @lionelderkrikor Once again, thank you both for your time and help. I am going to attach my .rmp file and both Excel files I use. If either of you can help me figure out how to get decent results for the neural network and at least one other predictive model, I would be so grateful.
For both Excel files, only the last sheet is used.
Neural Network.rmp
Testing set-A1-C3.xlsx
Training set-ExceptA1-C3 (2).xlsx
varunm1
Hello @Konradlk,
Do you have any reference performance values, or values you are aiming for? I modified your process and added an Optimize Parameters (Grid) operator around the neural network. I didn't change the layer information inside the neural network, such as adding neurons or layers.
I attached the working process, which runs without errors. You can change the layers of the Neural Net operator inside Optimize Parameters (Grid) to see how different layer setups perform (see the sketch below). I will try other settings; you can add layers and try as well. Use squared correlation and RMSE as your performance evaluation metrics.
Please let us know if you have more questions.
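For anyone who wants to mirror the parameter-grid idea outside AI Studio, here is a minimal scikit-learn sketch: a grid search over several hidden-layer configurations, scored by cross-validated RMSE. The configurations and data are assumptions for illustration:

```python
# Sketch of a parameter grid over hidden-layer configurations,
# compared by cross-validated RMSE.
from sklearn.datasets import make_regression
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPRegressor

X, y = make_regression(n_samples=400, n_features=8, noise=0.1, random_state=2)

grid = GridSearchCV(
    MLPRegressor(max_iter=2000, random_state=2),
    param_grid={"hidden_layer_sizes": [(2,), (4,), (10,), (2, 2)]},
    scoring="neg_root_mean_squared_error",
    cv=10,
)
grid.fit(X, y)

for params, score in zip(grid.cv_results_["params"],
                         grid.cv_results_["mean_test_score"]):
    print(params["hidden_layer_sizes"], f"RMSE = {-score:.3f}")
print("best:", grid.best_params_)
```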
varunm1
@Konradlk Here you go. I tried a couple of neural network configurations, varying the layer sizes and adding new layers. It looks like the best performance (in my trials) came from a single layer with 2 neurons. Adding more neurons or layers reduces the test performance, which points to overfitting.
The attached process seemed optimal, with an RMSE of 0.023 and a squared correlation of 0.5. You can try other models and compare them with the neural network to see whether the RMSE decreases and the squared correlation increases. Higher squared correlation and lower RMSE are better.
Below are the testing-data performances (RMSE and squared correlation, respectively):

Configuration                     RMSE    Squared Correlation
NN, single layer, 4 neurons       0.025   0.430
NN, single layer, 10 neurons      0.027   0.419
NN, two layers, 2 neurons each    0.027   0.395
NN, single layer, 2 neurons       0.023   0.500
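For reference, these two metrics can be written out in a few lines of Python. This is a generic sketch of RMSE and squared correlation (the square of Pearson's r between label and prediction), not the exact code AI Studio runs, and the values are made up:

```python
# Generic definitions of the two metrics in the table above.
import numpy as np

def rmse(y_true, y_pred):
    # Root mean squared error: lower is better.
    return float(np.sqrt(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2)))

def squared_correlation(y_true, y_pred):
    # Square of Pearson's r between labels and predictions: higher is better.
    r = np.corrcoef(y_true, y_pred)[0, 1]
    return float(r ** 2)

y_true = np.array([0.10, 0.20, 0.30, 0.40])  # made-up labels
y_pred = np.array([0.12, 0.18, 0.33, 0.38])  # made-up predictions
print(rmse(y_true, y_pred), squared_correlation(y_true, y_pred))
```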
Hope this helps.