Randomized controlled trials or data mining?
DocMusher
New Altair Community Member
Hi all, best wishes for 2016!
Randomized controlled trials (RCTs) represent the gold standard for evaluating healthcare interventions.
Is data mining in a position to surpass this standard?
What are the best answers to these statements?
1. Analysis of big data (already collected and retrospective by nature), used for modeling (and validation), is superior to prospective RCTs based on intentionally collected data?
2. The use of control groups (as in RCTs) is inferior to validation techniques such as cross-validation?
3. Randomization of datasets, as used in RCTs, is inferior to training and testing a model?
4. Blinding (double or triple blinding) is outdated because data mining on big data provides a better representation of the entire population?
Interested in your replies!
Sven
Answers
I'm not going to answer your questions directly, but I'll refer you to a couple of videos by Susan Athey that touch on the topic. Athey is one of the most brilliant economists currently working in this area. I'm sure you'll find her videos and papers very useful in answering your questions.
https://www.youtube.com/watch?v=L72E08QsyMc
https://www.youtube.com/watch?v=Yx6qXM_rfKQ
https://faculty-gsb.stanford.edu/athey/documents/HeterogeneousEffects.pdf
https://faculty-gsb.stanford.edu/athey/documents/AtheyKDDfinal.pdf
Thanks Carlos & Sven,
I'm reading one of the cited research papers now; it's pretty interesting. I will try to contribute to the debate once I have digested it.
http://arxiv.org/pdf/1504.01132v3.pdf
This paper is indeed very interesting. Is the technique directly applicable using RapidMiner?
Sven
It's a reworking of regression trees, so it would need a bit of work with Octave, R, or Python (or an extension written in Java) to get it into RapidMiner. As far as I'm aware, nobody has yet written a regression tree extension for RapidMiner, so we would need to write that as well.
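To give a feel for the building block involved, here is a minimal sketch in plain Python of a single regression-tree split (a "stump") chosen by minimising the squared error. This is ordinary CART-style splitting, not the modified criterion from the paper, and it is not RapidMiner code. The function names (`sse`, `best_split`, `predict`) and the toy data are made up purely for illustration.

```python
from statistics import mean


def sse(values):
    """Sum of squared errors around the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def best_split(xs, ys):
    """Find the threshold on one feature that minimises the total SSE.

    Returns (threshold, left_mean, right_mean).
    """
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold fits between identical feature values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        cost = sse(left) + sse(right)
        if best is None or cost < best[0]:
            threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
            best = (cost, threshold, mean(left), mean(right))
    if best is None:
        raise ValueError("need at least two distinct feature values")
    return best[1:]


def predict(stump, x):
    """Predict with a fitted stump: the mean outcome on x's side of the split."""
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


# Hypothetical toy data: one feature (age) and one numeric outcome.
ages = [30, 35, 40, 60, 65, 70]
outcomes = [1.0, 1.2, 0.9, 3.1, 2.9, 3.0]
stump = best_split(ages, outcomes)
```

A full regression tree would apply `best_split` recursively to each side and over every feature. As I understand it, the approach in the paper changes what is estimated in the leaves and how splits are scored, so treat this only as the starting point.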
I was thinking about adding bindings for https://github.com/yinlou/mltk, which supports regression trees.