Healthcare data warehouse
Dear RM friends,
In 2016 we published a paper in PLOS ONE to demonstrate the potential of RM for the integration, use, and analysis of a large medical database (MIMIC-III) on a dedicated Hadoop cluster with a Hive server, using Radoop.
https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0145791
The paper has been cited several times and read by over 7,000 people, many of whom discovered "RapidMiner" through it (metrics in the attached PlosOneRM.docx).
Additionally, our research was presented at MIT.
https://www.slideshare.net/svenvanpoucke1/rapidminer-an-entrance-to-explore-mimiciii
Based on the current RM technology and the increased demand for secondary analysis of medical records, I would love to investigate how we could use the same database as described in the following blog.
Interested in your feedback. Greetings from Belgium,
Sven
I am reading through https://aws.amazon.com/blogs/big-data/build-a-healthcare-data-warehouse-using-amazon-emr-amazon-redshift-aws-lambda-and-omop/
and waiting for more info from MIT on how to get access to the MIMIC-III AMI with my login.
I will send you my reply as soon as I get feedback from them.
I am now looking into https://s3.amazonaws.com/physionet-pds/index.html
Thanks
Sven
I put a quick sample into Execute Python and I can get a response from wfdb. It's not formatted or anything...just showing the data connection.
<?xml version="1.0" encoding="UTF-8"?><process version="9.0.003"> <context> <input/> <output/> <macros/> </context> <operator activated="true" class="process" compatibility="9.0.003" expanded="true" name="Process"> <process expanded="true"> <operator activated="true" class="python_scripting:execute_python" compatibility="8.2.000" expanded="true" height="103" name="Execute Python (2)" width="90" x="45" y="34"> <parameter key="script" value="import pandas import wfdb def rm_main(): signals2, fields2 = wfdb.rdsamp('s0010_re', channels=[14, 0, 5, 10], sampfrom=100, sampto=15000, pb_dir='ptbdb/patient001/') return fields2"/> </operator> <operator activated="true" class="text:read_document" compatibility="8.1.000" expanded="true" height="68" name="Read Document" width="90" x="179" y="34"/> <connect from_op="Execute Python (2)" from_port="output 1" to_op="Read Document" to_port="file"/> <connect from_op="Read Document" from_port="output" to_port="result 1"/> <portSpacing port="source_input 1" spacing="0"/> <portSpacing port="sink_result 1" spacing="0"/> <portSpacing port="sink_result 2" spacing="0"/> </process> </operator> </process>

Scott
AWS resources are usually directly consumable in RM Studio. This means you can just connect to the S3 and Redshift resources outlined there. Additionally, one could host an RM Server infrastructure on AWS to be closer to the source.
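
For the S3 part, here is a minimal Python sketch of what such a connection boils down to outside Studio, listing the public physionet-pds bucket mentioned above with boto3. It assumes anonymous access works for that bucket, the download key is a hypothetical placeholder, and in Studio you would use the S3/Redshift connectors instead:

import boto3
from botocore import UNSIGNED
from botocore.config import Config

# Unsigned (anonymous) client, since physionet-pds is a public bucket.
s3 = boto3.client('s3', config=Config(signature_version=UNSIGNED))

# List a handful of keys to see what the bucket contains.
resp = s3.list_objects_v2(Bucket='physionet-pds', MaxKeys=10)
for obj in resp.get('Contents', []):
    print(obj['Key'], obj['Size'])

# Download a single object to a local file (key name is a placeholder).
# s3.download_file('physionet-pds', 'path/to/some/record.dat', 'record.dat')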
BR,
Martin
Martin,
Thanks for the feedback. I will activate my neuronal synapses and see whether I am able to do this. Anyhow, if any community member is interested in joining, I would be happy.
Cheers
Sven

If you have resources sitting on AWS that I can play with, I'd be happy to help.
Scott