Healthcare data warehouse
Dear RM friends,
In 2016 we published a paper in PLOS ONE to demonstrate the potential of RM for the integration, use, and analysis of a large medical database (MIMIC-III) on a dedicated Hadoop cluster with a Hive server, using Radoop.
https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0145791
The paper has been cited several times and read by over 7,000 people, many of whom discovered "RapidMiner" through it (metrics in the attached PlosOneRM.docx).
Additionally, our research was presented at MIT.
https://www.slideshare.net/svenvanpoucke1/rapidminer-an-entrance-to-explore-mimiciii
Based on the current RM technology and the increased demand for secondary analysis of medical records, I would love to investigate how we could use the same database as described in the following blog.
Interested in your feedback. Greetings from Belgium,
Sven
I am reading through https://aws.amazon.com/blogs/big-data/build-a-healthcare-data-warehouse-using-amazon-emr-amazon-redshift-aws-lambda-and-omop/
and waiting for more info from MIT on how to get access to the MIMIC-III AMI with my login.
I will send you my reply as soon as I get feedback from them.
I am now looking into https://s3.amazonaws.com/physionet-pds/index.html
Thanks
Sven
I put a quick sample into Execute Python and I can get a response from wfdb. It's not formatted or anything...just showing the data connection.
<?xml version="1.0" encoding="UTF-8"?><process version="9.0.003"> <context> <input/> <output/> <macros/> </context> <operator activated="true" class="process" compatibility="9.0.003" expanded="true" name="Process"> <process expanded="true"> <operator activated="true" class="python_scripting:execute_python" compatibility="8.2.000" expanded="true" height="103" name="Execute Python (2)" width="90" x="45" y="34"> <parameter key="script" value="import pandas import wfdb def rm_main(): signals2, fields2 = wfdb.rdsamp('s0010_re', channels=[14, 0, 5, 10], sampfrom=100, sampto=15000, pb_dir='ptbdb/patient001/') return fields2"/> </operator> <operator activated="true" class="text:read_document" compatibility="8.1.000" expanded="true" height="68" name="Read Document" width="90" x="179" y="34"/> <connect from_op="Execute Python (2)" from_port="output 1" to_op="Read Document" to_port="file"/> <connect from_op="Read Document" from_port="output" to_port="result 1"/> <portSpacing port="source_input 1" spacing="0"/> <portSpacing port="sink_result 1" spacing="0"/> <portSpacing port="sink_result 2" spacing="0"/> </process> </operator> </process>

Scott
AWS resources are usually directly consumable in RM Studio. This means you can just connect to the S3 and Redshift resources outlined there. Additionally, one could host an RM Server infrastructure on AWS to be closer to the source.
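
For the S3 part, here is a minimal Python sketch of what such a connection boils down to outside Studio, listing the public physionet-pds bucket mentioned above with boto3. It assumes anonymous access works for that bucket, the download key is a hypothetical placeholder, and in Studio you would use the S3/Redshift connectors instead:

import boto3
from botocore import UNSIGNED
from botocore.config import Config

# Unsigned (anonymous) client, since physionet-pds is a public bucket.
s3 = boto3.client('s3', config=Config(signature_version=UNSIGNED))

# List a handful of keys to see what the bucket contains.
resp = s3.list_objects_v2(Bucket='physionet-pds', MaxKeys=10)
for obj in resp.get('Contents', []):
    print(obj['Key'], obj['Size'])

# Download a single object to a local file (key name is a placeholder).
# s3.download_file('physionet-pds', 'path/to/some/record.dat', 'record.dat')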
BR,
Martin
Martin,
Thanks for the feedback. I will activate my neuronal synapses and see whether I am able to do this. Anyhow, if any community member is interested in joining, I would be happy.
Cheers
Sven

If you have resources sitting on AWS that I can play with, I'd be happy to help.
Scott