-
How to use Auto Model for data that I have already split into train and test?
I am trying to solve an imbalanced binary classification problem using a model to predict the minority class (stroke victims). I used oversampling on the training data to create synthetic instances of stroke cases so that I could address the data imbalance issue. However, I kept the test data in its normal imbalanced…
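A minimal sketch of the idea described in this question, written in Python with pandas, scikit-learn and imbalanced-learn rather than as a RapidMiner process (the file name and column names are hypothetical): oversample only the training split and leave the test split imbalanced.

```python
# Sketch: oversample ONLY the training split with SMOTE; keep the test split
# in its original imbalanced state. File and column names are hypothetical.
import pandas as pd
from sklearn.model_selection import train_test_split
from imblearn.over_sampling import SMOTE

df = pd.read_csv("stroke.csv")                      # hypothetical file
X, y = df.drop(columns="stroke"), df["stroke"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42)

# Synthetic minority examples are created from the training data only;
# X_test / y_test stay untouched and imbalanced.
X_train_bal, y_train_bal = SMOTE(random_state=42).fit_resample(X_train, y_train)
```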
-
Auto model validation
From what I have read, Auto Model splits the data 60:40, then splits the 40% into 7 subsets, scores on those, and reports the average performance. Is the hyperparameter tuning or validation done using the 40%? If so, how can this 40% then be used for scoring when the validation was already done on…
-
Scoring fails
Hi everyone: I get several models from Auto Model, and next I want to validate these models with another dataset. The new dataset has the same structure (columns, domains and value ranges). I'm using Apply Model with the model from Auto Model and the new dataset. The problem is that the prediction results are only one class, no…
-
Model Questions
Hi, everyone. After I learned from the resources, I have a few questions. 1. Why is "Apply Preprocessing" added? What is its function? 2. Why is "Group Models" added here? Why does "pre" connect to "mod"? Thanks for giving answers.
-
Consecutive failure using ANN on test validation
Hello everyone, how can I find the consecutive failures on the validation test when I use an ANN? Which variable represents this value? Thanks a lot.
-
External Validation
I did the following process and got good performance results by cross validation. Now I want to run an external data set through this very same model. How can I do so? The Retrieve valdays_complete operator thereby provides the external set, and Filter Examples (2) selects the dementia subgroup (which is also the subgroup used for modelling).
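A sketch of the general pattern being asked about, in Python/scikit-learn rather than the RapidMiner process itself: estimate performance with cross-validation, then refit on the full modelling data and apply the model to the external set. The file names, the classifier, and the label column are all hypothetical stand-ins.

```python
# Sketch: cross-validated performance estimate, then scoring an external set.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

train = pd.read_csv("modelling_subgroup.csv")     # hypothetical modelling data
external = pd.read_csv("valdays_complete.csv")    # hypothetical export of the external set

X, y = train.drop(columns="label"), train["label"]
clf = RandomForestClassifier(random_state=0)

print("CV accuracy:", cross_val_score(clf, X, y, cv=10).mean())

# Refit on all modelling data, then apply the model to the external examples
# (assuming the external file has the same columns).
clf.fit(X, y)
external_pred = clf.predict(external.drop(columns="label"))
```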
-
Validation of the model and adjusted R squared
Hey Community, I have a question regarding the validation of my model (I used the cross validation operator). I created a prediction model (label: numeric) and therefore used the algorithms "Linear Regression", "Neural Net" and "Deep Learning". For validation I chose the RMSE, the relative error and the squared correlation…
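For reference, the adjusted R squared mentioned in the title is the usual correction of R² for the number of predictors p and the sample size n:

```latex
\bar{R}^2 = 1 - (1 - R^2)\,\frac{n - 1}{n - p - 1}
```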
-
How can I specify a validation dataset for H2o Deep Learning model in RapidMiner
Hello, I'm using the Deep Learning model of the H2O framework available in RapidMiner. For my analyses I don't see how to control the data used for the validation step at the end of each epoch. For example, with Keras you can specify a validation split rate, and that part of your data is used for validation. With…
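For comparison, a sketch using the plain H2O Python API outside RapidMiner (assuming the standard h2o package), where a separate validation frame can be passed so that per-epoch scoring is done on data you choose; whether the RapidMiner operator exposes the same option is exactly what the question asks. File names and layer sizes are hypothetical.

```python
# Sketch: H2O deep learning with an explicit validation frame.
import h2o
from h2o.estimators import H2ODeepLearningEstimator

h2o.init()
train = h2o.import_file("train.csv")   # hypothetical files
valid = h2o.import_file("valid.csv")

model = H2ODeepLearningEstimator(epochs=50, hidden=[32, 32])
model.train(x=train.columns[:-1], y=train.columns[-1],
            training_frame=train, validation_frame=valid)
```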
-
Split-Validation Issue
@sgenzer, I believe there may be some issue with the split-validation operator. The model output through the entire split-validation process does not correspond to the model with which the validation performance metrics are computed. I have attached an Excel spreadsheet to show the computations with a formula. The RMSE…
-
How to create a QQ plot ?
Hi there, is there a way to create a QQ plot in RM? Best regards
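As a workaround while the question about doing this natively in RapidMiner stands, a short Python sketch with scipy and matplotlib (the file and column names are hypothetical):

```python
# Sketch: QQ plot of a numeric column against the normal distribution.
import matplotlib.pyplot as plt
import pandas as pd
from scipy import stats

values = pd.read_csv("data.csv")["value"]          # hypothetical column
stats.probplot(values, dist="norm", plot=plt)      # sample vs. theoretical quantiles
plt.title("QQ plot")
plt.show()
```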
-
Export Training and Testing datasets
Hello, I want to export the training and testing datasets after doing a split or cross validation into either Excel or CSV. I want the training dataset, after the split, to be in an independent Excel/CSV file, and the same for the testing dataset (as in the picture). I tried putting a Write CSV in the training window and another Write CSV in the testing window. It…
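Outside RapidMiner, the equivalent operation is straightforward; a Python/pandas sketch with hypothetical file names:

```python
# Sketch: split the data and write each partition to its own CSV file.
import pandas as pd
from sklearn.model_selection import train_test_split

df = pd.read_csv("data.csv")                      # hypothetical input
train, test = train_test_split(df, test_size=0.3, random_state=42)

train.to_csv("train_partition.csv", index=False)
test.to_csv("test_partition.csv", index=False)
```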
-
Example set has no nominal label: using shuffled partition instead of stratified partition!
I am getting this message (actually a warning) while running my model. I tried to change the data type of one field from real to nominal, but the warning still didn't go away. I am using GBT for prediction on my data set. Can you tell me how to resolve this issue? Thank you.
-
How to reduce RMSE/SE when it's too high
Hi All, my data has 2 integer attributes and otherwise polynominal attributes: id, state, year, month, leads (int), responses (int), typeOfMail, status. I used split validation, splitting my 22 months of data into 20 and 2 months, and I got an RMSE of 12.41 and squared_error: 154.176 +/- 335.663. I don't know how to reduce this and also I'm not sure…
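Note that the two reported error measures describe the same quantity: RMSE is just the square root of the mean squared error, so reducing one necessarily reduces the other. With the numbers from the question:

```latex
\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}
              = \sqrt{154.176} \approx 12.4
```

which matches the reported RMSE of 12.41.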
-
Batch ID Generation
Hello, is there a simple way to generate a batch id (divide samples into 5 groups) based on an ID column in a dataset? For example, I have a dataset with 400 samples related to 30 subjects (multiple samples per subject). I would like to divide the data set into 5 (this can be any value) batches based on the Subject (not…
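One possible approach, sketched in Python/pandas rather than RapidMiner, that keeps all samples of a subject in the same batch; the file and column names are hypothetical.

```python
# Sketch: assign a batch id per subject so samples of one subject stay together.
import pandas as pd

df = pd.read_csv("samples.csv")                   # e.g. 400 samples, 30 subjects
n_batches = 5

# Give each subject a stable integer, then fold the integers into n_batches groups.
df["batch_id"] = df.groupby("subject_id").ngroup() % n_batches
```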
-
Split Validation within Backward Elimination
Hello everyone, I am currently executing a Backward Elimination which contains a simple Naive Bayes inside a Split Validation. The subprocess works iteratively, switching between the milestones Naive Bayes --> Apply Model --> Performance and then starting again with Naive Bayes (see attached). Is this…
-
Model validation performance
Hello everyone, which validation approach (with regard to the learning and testing phase) of classification models is quicker: cross-validation or the classical split validation (with a 70:30 split)? Thank you in advance for your help! Best regards, Fatih
-
Manual inspection of misclassified examples
Hello, I'm trying to find out how, after training a classification model, I can look at the examples that were incorrectly classified. For now I can only see how many examples were misclassified in the confusion matrix, but I want to inspect the misclassified examples manually. Since the evaluation vector does not…
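The underlying idea is simply to filter the scored example set down to rows where the prediction disagrees with the label (in RapidMiner this could be done with something like Filter Examples on the scored data). A Python/pandas sketch with hypothetical column names:

```python
# Sketch: keep only the rows where the model got it wrong.
import pandas as pd

scored = pd.read_csv("scored_examples.csv")       # has label + prediction columns
misclassified = scored[scored["label"] != scored["prediction"]]
print(misclassified.head())
```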
-
Automodel learn/test
Is there a good reason to split the data 60/20/20%, where the last 20% is used to "test the testing of the conclusion", as proposed on another platform?
-
How to use ARIMA with Forecast Validation and Optimize Parameters Operator?
Hi, I want to build a sales forecast with an ARIMA model. Therefore I would like to train and test my model, and additionally I would like to find the best values for p, d and q. Can someone help me with how to nest the ARIMA model, Forecast Validation and Optimize Parameters operators inside each other? Thank you in advance for…
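To illustrate the logic of the nesting (not the RapidMiner operators themselves), here is a sketch in Python with statsmodels, plainly substituted for ARIMA + Forecast Validation + Optimize Parameters: try (p, d, q) combinations and keep the order with the best error on a held-out tail of the series. The file, column, and horizon are hypothetical.

```python
# Sketch: grid-search ARIMA orders against a holdout tail of the series.
import itertools
import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

sales = pd.read_csv("sales.csv")["amount"]        # hypothetical series
train, test = sales[:-12], sales[-12:]            # hold out the last 12 points

best = None
for p, d, q in itertools.product(range(3), range(2), range(3)):
    try:
        fit = ARIMA(train, order=(p, d, q)).fit()
        rmse = np.sqrt(np.mean((fit.forecast(len(test)) - test) ** 2))
        if best is None or rmse < best[0]:
            best = (rmse, (p, d, q))
    except Exception:
        continue                                   # some orders may fail to converge

print("best order:", best[1], "holdout RMSE:", best[0])
```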
-
Walking Forward Testing
Hello, I built a multivariate regression forecast using a NN. Results seem to be OK so far. However, since I'm forecasting the next value (+1) using all past values, I would like to be able to test the model in a walk-forward way, i.e. using past values to predict the next one in a rolling way until the last example. I thought…
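A sketch of walk-forward (expanding window) testing in Python/scikit-learn, with MLPRegressor standing in for the neural net from the question: at each step the model is trained on everything up to time t and predicts t+1. Data loading, column names, and the minimum training size are hypothetical.

```python
# Sketch: expanding-window walk-forward evaluation of a one-step-ahead forecast.
import pandas as pd
from sklearn.neural_network import MLPRegressor

df = pd.read_csv("series_with_features.csv")      # hypothetical feature table
X, y = df.drop(columns="target"), df["target"]

predictions = []
start = 100                                        # minimum training window
for t in range(start, len(df) - 1):
    model = MLPRegressor(hidden_layer_sizes=(32,), max_iter=500, random_state=0)
    model.fit(X.iloc[:t + 1], y.iloc[:t + 1])              # all past values up to t
    predictions.append(model.predict(X.iloc[[t + 1]])[0])  # forecast t + 1
```

scikit-learn's TimeSeriesSplit offers a similar, coarser-grained alternative if retraining at every single step is too slow.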
-
Cross Validation with Smote Upsampling
Hi all, I see that there are already some discussions in this community about this subject; however, I still have some doubts. I have a process in which there is a class imbalance and the minority class is the most important one. SMOTE upsampling seems to provide good results. I say "seems" because I have doubts about how to…
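The usual way to combine the two, sketched here in Python with imbalanced-learn rather than RapidMiner: put SMOTE inside the modelling pipeline so it is fitted on the training folds only, while each held-out fold stays imbalanced. The data loading and classifier are hypothetical.

```python
# Sketch: SMOTE applied inside each cross-validation training fold only.
import pandas as pd
from imblearn.over_sampling import SMOTE
from imblearn.pipeline import Pipeline
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

df = pd.read_csv("data.csv")                      # hypothetical input
X, y = df.drop(columns="class"), df["class"]

pipe = Pipeline([("smote", SMOTE(random_state=0)),
                 ("clf", RandomForestClassifier(random_state=0))])

# SMOTE is re-applied inside each training fold; the test fold is scored as-is.
print(cross_val_score(pipe, X, y, cv=10, scoring="balanced_accuracy").mean())
```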