Combining Multiple Imputations in RapidMiner
faridehbagherza
New Altair Community Member
Hi! I used R for multiple imputation and generated 5 imputed versions of my data. For the model, I am using a stacking model with 3 base learners.
I don't know what I should do with these imputed data sets. Should I train all my base learners on each of them individually?
That sounds right, but it takes a lot of time to train each base learner on each imputed data set and then train the stacked model on each imputed data set as well!
Anyway, if that's right, how can I combine the five models learned from the 5 imputed data sets?
I mean, there are operators for combining models into a stacking model, AdaBoost, and so on, but I couldn't find any operator for combining models built from different imputed data sets!
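To make the combination step concrete, here is a minimal Python sketch that runs outside RapidMiner and is not an existing operator. It assumes each of the 5 models (one per imputed data set) outputs a confidence per class for every example, and it pools them by simple averaging. The function name `pool_confidences` and the confidence values are hypothetical and only for illustration.

```python
from statistics import mean

def pool_confidences(per_model_confidences):
    """Average class confidences across models and pick the top class.

    per_model_confidences: one entry per model (i.e. per imputed data set);
    each entry is a list with one dict per example, mapping
    class label -> confidence.
    Returns a list with one pooled label per example.
    """
    n_examples = len(per_model_confidences[0])
    pooled_labels = []
    for i in range(n_examples):
        labels = per_model_confidences[0][i].keys()
        # Mean confidence of each class over all models for example i
        avg = {c: mean(m[i][c] for m in per_model_confidences) for c in labels}
        pooled_labels.append(max(avg, key=avg.get))
    return pooled_labels

# Hypothetical confidences from 5 models for 2 examples (made-up numbers)
confs = [
    [{"yes": 0.90, "no": 0.10}, {"yes": 0.40, "no": 0.60}],
    [{"yes": 0.80, "no": 0.20}, {"yes": 0.45, "no": 0.55}],
    [{"yes": 0.70, "no": 0.30}, {"yes": 0.60, "no": 0.40}],
    [{"yes": 0.60, "no": 0.40}, {"yes": 0.30, "no": 0.70}],
    [{"yes": 0.55, "no": 0.45}, {"yes": 0.35, "no": 0.65}],
]
pooled = pool_confidences(confs)
print(pooled)
```

Averaging confidences keeps more information than a hard vote when the models disagree only slightly, but whether it actually helps on a given data set would have to be checked by validation.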
Answers
Here is a sample of what I was talking about:
I uploaded two pieces of code to pastebin.com:
1st code: http://pastebin.com/vjr8p9a7
2nd code: http://pastebin.com/Zn0aduu5
Here is a little explanation about them:
1. You need the VIM package for R to be able to run them!
2. I uploaded two codes for you! In the first one I imputed just 1 data set, and in the second one I imputed 5 data sets.
About the first code: in the first Subprocess I trained 3 base learners, and in the second Subprocess I used these 3 learners to train a stacking model!
The stacking model has the best performance of all!
About the second code: in the first Subprocess I used the 5 imputations to train 5 stacking models, just as I did in the first code! Then in the second Subprocess I applied voting to these 5 models, one per imputation, to combine their results and gain better performance!
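As a rough illustration of the voting step described above (a plain Python sketch, not the RapidMiner process from the pastebin links), majority voting over the predictions of the 5 stacking models could look like this. The function name `majority_vote` and the labels are hypothetical; the sketch assumes every model predicts a label for the same examples in the same order.

```python
from collections import Counter

def majority_vote(per_model_labels):
    """Combine label predictions from several models by majority vote.

    per_model_labels: one list of predicted labels per model
    (here: one model per imputed data set), all of equal length.
    Returns one voted label per example. On a tie, the label that
    appears first among the models' predictions wins.
    """
    voted = []
    for labels_for_example in zip(*per_model_labels):
        winner, _count = Counter(labels_for_example).most_common(1)[0]
        voted.append(winner)
    return voted

# Hypothetical predictions of 5 models for 3 examples (made-up labels)
predictions = [
    ["a", "b", "a"],
    ["a", "b", "b"],
    ["b", "b", "a"],
    ["a", "a", "a"],
    ["a", "b", "b"],
]
voted = majority_vote(predictions)
print(voted)
```

With an odd number of models and two classes, as in the 5-imputation setup, ties cannot occur, so the tie rule only matters for other configurations.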
I hope the process doesn't confuse you!
Any suggestions on the whole process would be welcome!
I mean, for example, any other way to combine the results of the imputations instead of voting!
In these processes I trained all the base learners on all the imputations. Is that the common way?
Thanks in advance.
Regards
Farideh
Hi,
Please do not post the same topic all over the forums asking for help. If anything, it will slow down the help you get. The thread continues here: http://rapid-i.com/rapidforum/index.php/topic,6983.msg24400.html
Regards,
Marco