Healthcare data warehouse
Dear RM friends,
In 2016 we published a paper in PLOS ONE demonstrating the potential of RM for the integration, use, and analysis of a large medical database (MIMIC-III) on a dedicated Hadoop cluster with a Hive server, using Radoop.
https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0145791
The paper has been cited several times and read by over 7,000 people discovering "RapidMiner" (metrics attached as PlosOneRM.docx).
Additionally, our research was presented at MIT.
https://www.slideshare.net/svenvanpoucke1/rapidminer-an-entrance-to-explore-mimiciii
Based on the current technology of RM and the increased demand for secondary analysis of medical records, I would love to investigate how we could use the same database as described in the following blog.
Interested in your feedback, and greetings from Belgium,
Sven
Best Answers
-
Sven,
AWS resources are usually directly consumable in RM Studio. This means you can just connect to the S3 and Redshift resources outlined there. Additionally, one could host a RM Server infrastructure on AWS to be closer to the source.
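As a rough sketch of what that could look like from inside an Execute Python operator (the bucket and key names below are hypothetical, and it assumes boto3 and pandas are installed with AWS credentials configured):

```python
import io

def s3_uri(bucket, key):
    """Compose the s3:// URI for a bucket/key pair (handy for logging/config)."""
    return f"s3://{bucket}/{key}"

def read_csv_from_s3(bucket, key):
    """Fetch a CSV object from S3 into a pandas DataFrame.

    Requires boto3 and pandas, plus AWS credentials (e.g. environment
    variables or ~/.aws/credentials).
    """
    import boto3
    import pandas as pd
    body = boto3.client("s3").get_object(Bucket=bucket, Key=key)["Body"].read()
    return pd.read_csv(io.BytesIO(body))

def rm_main():
    # Hypothetical bucket/key, just to show the shape of the call.
    return read_csv_from_s3("my-mimic-bucket", "mimic-iii/ADMISSIONS.csv")
```

Returning the DataFrame from rm_main() hands it to the operator's output port as an ExampleSet, so the rest of the process can stay native RM.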
BR,
Martin
-
Martin,
Thanks for the feedback. I'll activate my neuronal synapses and see if I can do this. Anyhow, if any community member is interested in joining, I would be happy.
Cheers
Sven
-
Hi Sven - Yes as Martin said RM works very well with AWS resources. Your project could easily look like this:
If you have resources sitting on AWS that I can play with, I'd be happy to help.
Scott
-
Hi Scott,
I am reading through https://aws.amazon.com/blogs/big-data/build-a-healthcare-data-warehouse-using-amazon-emr-amazon-redshift-aws-lambda-and-omop/
and waiting on more info from MIT about how I can get access to the MIMIC-III AMI with my login.
I will send you my reply as soon as I get feedback from them.
I am now looking into https://s3.amazonaws.com/physionet-pds/index.html
Thanks
Sven
-
Hi @DocMusher, I did a bit more reading and, to be honest, the easier way to go may be Python with the wfdb library: https://github.com/MIT-LCP/wfdb-python
I put a quick sample into Execute Python and I can get a response from wfdb. It's not formatted or anything, just showing the data connection:

<?xml version="1.0" encoding="UTF-8"?>
<process version="9.0.003">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="9.0.003" expanded="true" name="Process">
    <process expanded="true">
      <operator activated="true" class="python_scripting:execute_python" compatibility="8.2.000" expanded="true" height="103" name="Execute Python (2)" width="90" x="45" y="34">
        <parameter key="script" value="import pandas&#10;import wfdb&#10;&#10;def rm_main():&#10;    signals2, fields2 = wfdb.rdsamp('s0010_re', channels=[14, 0, 5, 10], sampfrom=100, sampto=15000, pb_dir='ptbdb/patient001/')&#10;    return fields2"/>
      </operator>
      <operator activated="true" class="text:read_document" compatibility="8.1.000" expanded="true" height="68" name="Read Document" width="90" x="179" y="34"/>
      <connect from_op="Execute Python (2)" from_port="output 1" to_op="Read Document" to_port="file"/>
      <connect from_op="Read Document" from_port="output" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>
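For anyone who wants to try the same call outside RapidMiner, here is the script as a plain Python sketch. It assumes the wfdb package is installed (pip install wfdb) and network access to PhysioNet; note that newer wfdb releases renamed the pb_dir parameter to pn_dir, so adjust to your installed version.

```python
def sample_range(start_sec, stop_sec, fs):
    """Convert a time window in seconds to (sampfrom, sampto) sample indices."""
    return int(start_sec * fs), int(stop_sec * fs)

def fetch_ptb_segment():
    """Download a slice of PTB record s0010_re from PhysioNet.

    Requires `pip install wfdb` and network access. Returns (signals, fields),
    where signals is a NumPy array of the requested channels and fields is a
    dict of record metadata (sampling frequency, signal names, units, ...).
    """
    import wfdb
    sampfrom, sampto = sample_range(0.1, 15.0, 1000)  # PTB records: 1000 Hz
    return wfdb.rdsamp(
        's0010_re',
        channels=[14, 0, 5, 10],
        sampfrom=sampfrom,
        sampto=sampto,
        pb_dir='ptbdb/patient001/',  # pn_dir in newer wfdb versions
    )
```

The sample_range helper just makes the hard-coded 100/15000 sample indices from the post explicit as a time window at the record's sampling frequency.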
I am not a Python guru by any means, but there are many folks here who are if you need help.
Scott