Export Training and Testing datasets
raw160107
New Altair Community Member
Hello
I want to export the training and testing datasets after a split or cross-validation,
to either Excel or CSV.
I want the training dataset, after it has been split, to be in its own Excel/CSV file.
The same goes for the testing dataset.
(As in the picture) I tried putting one Write CSV in the training window and another Write CSV in the testing window.
It doesn't work.
How do I do that?
Best Answer
Hello @raw160107
You don't need to connect the output ports of Write CSV. Just set the file name and the location where you want to store the data, then run the process. This works fine for Split Validation.
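For anyone who wants to see the same idea outside RapidMiner, here is a minimal Python sketch of a single split written to two independent CSV files. It is not what the Write CSV operator does internally; the function name split_and_export, the 0.7 ratio, and the fixed seed are all assumptions made for illustration.

```python
import csv
import random


def split_and_export(rows, header, train_path, test_path, split_ratio=0.7, seed=0):
    """Shuffle rows, split them once, and write each part to its own CSV file.

    Hypothetical helper for illustration; returns (n_train, n_test).
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed so the split is repeatable
    cut = round(len(shuffled) * split_ratio)
    parts = ((train_path, shuffled[:cut]), (test_path, shuffled[cut:]))
    for path, part in parts:
        with open(path, "w", newline="") as f:
            writer = csv.writer(f)
            writer.writerow(header)  # each file gets its own header row
            writer.writerows(part)
    return cut, len(shuffled) - cut
```

Because there is only one training set and one test set here, two fixed file names are enough.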
In the case of Cross Validation, this doesn't work on its own, because there are multiple training and test sets, one pair per fold, and a single fixed file name cannot keep them apart. One way I handle it is by using macros.
When you give the name for the CSV file, include the macro %{execution_count}. This stores each training and test set in a separate file and makes clear which fold it belongs to.
I attached a sample process (.rmp file) to this thread; please look at the csv file parameter of Write CSV to see how I used macros. To open the attached process, download it from this thread, then in RapidMiner Studio go to File --> Import Process and navigate to the file.
The train and test file names in the Write CSV operators inside Cross Validation are shown below. You can use any name, but it should end with _%{execution_count}
Cross_Train_Fold_%{execution_count}
Cross_Test_Fold_%{execution_count}
Once the process has run, it will create files named Cross_Train_Fold_1.csv, Cross_Train_Fold_2.csv, ... (and the matching Cross_Test_Fold_ files), according to the number of folds in the cross-validation.
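To make the naming scheme concrete, here is a rough Python sketch that imitates it outside RapidMiner: it replaces %{execution_count} in a file name template with the fold number and writes one train file and one test file per fold. The helpers expand_macro and export_folds are hypothetical, the interleaved fold assignment is just a simple stand-in for whatever sampling Cross Validation uses, and appending ".csv" is an assumption based on the file names listed above.

```python
import csv
from pathlib import Path


def expand_macro(template, execution_count):
    """Replace the %{execution_count} placeholder with the fold number."""
    return template.replace("%{execution_count}", str(execution_count))


def export_folds(rows, header, k, out_dir,
                 train_template="Cross_Train_Fold_%{execution_count}",
                 test_template="Cross_Test_Fold_%{execution_count}"):
    """Write one train and one test CSV per fold; return the file names written."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for fold in range(k):
        # Assumption: row i belongs to the test set of fold (i % k).
        test = [r for i, r in enumerate(rows) if i % k == fold]
        train = [r for i, r in enumerate(rows) if i % k != fold]
        for template, part in ((train_template, train), (test_template, test)):
            # Fold numbers start at 1, matching Cross_Train_Fold_1.csv above.
            path = out_dir / (expand_macro(template, fold + 1) + ".csv")
            with path.open("w", newline="") as f:
                writer = csv.writer(f)
                writer.writerow(header)
                writer.writerows(part)
            written.append(path.name)
    return written
```

Without the fold number in the name, every pass through the loop would write to the same two paths, which is why the macro goes at the end of the file name.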
Please let us know if you need more information.