Healthcare data warehouse

User: "DocMusher"
New Altair Community Member
Updated by Jocelyn

Dear RM friends,

In 2016 we published a paper in PLOS ONE to demonstrate the potential of RM for the integration, use, and analysis of a large medical database (MIMIC-III) in a dedicated Hadoop cluster with a Hive server, using Radoop.

https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0145791

The paper has been cited several times and read by over 7,000 people, introducing many of them to RapidMiner (metrics in the attached PlosOneRM.docx).

Additionally, our research was presented at MIT.

https://www.slideshare.net/svenvanpoucke1/rapidminer-an-entrance-to-explore-mimiciii

Based on the current RM technology and the increased demand for secondary analysis of medical records, I would love to investigate how we could use the same database as described in the following blog post.

https://aws.amazon.com/blogs/big-data/build-a-healthcare-data-warehouse-using-amazon-emr-amazon-redshift-aws-lambda-and-omop/

Interested in your feedback. Greetings from Belgium,

Sven

    User: "MartinLiebig"
    Altair Employee
    Accepted Answer
    Sven,

    AWS resources are usually directly consumable in RM Studio. This means you can simply connect to the S3 and Redshift resources outlined there. Additionally, one could host an RM Server infrastructure on AWS to be closer to the source.
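
    As a rough illustration (not an official recipe), querying a Redshift cluster from inside an Execute Python operator or from a standalone script could look like the sketch below; the endpoint, database, schema, table, and credentials are placeholders, and it assumes psycopg2 and pandas are available in the Python environment.

    import psycopg2
    import pandas as pd

    # Placeholder Redshift connection details -- replace with your own cluster.
    conn = psycopg2.connect(
        host='my-cluster.abc123xyz.us-east-1.redshift.amazonaws.com',
        port=5439,
        dbname='mimic',
        user='analyst',
        password='***'
    )

    # Pull a small sample of a table into a pandas DataFrame for inspection.
    cur = conn.cursor()
    cur.execute('SELECT * FROM cdm.person LIMIT 100;')
    rows = cur.fetchall()
    df = pd.DataFrame(rows, columns=[col[0] for col in cur.description])
    print(df.head())

    cur.close()
    conn.close()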

    BR,
    Martin
    User: "DocMusher"
    New Altair Community Member
    OP
    Accepted Answer
    Updated by DocMusher

    Martin,

    Thanks for the feedback. I will activate my neuronal synapses and see whether I am able to do this. Anyhow, if any community member is interested in joining, I would be happy.

    Cheers

    Sven

    User: "sgenzer"
    Altair Employee
    Accepted Answer
    Hi Sven - yes, as Martin said, RM works very well with AWS resources. Your project could easily look like this:



    If you have resources sitting on AWS that I can play with, I'd be happy to help.

    Scott
    User: "DocMusher"
    New Altair Community Member
    OP
    Accepted Answer
    Updated by DocMusher
    Hi Scott,
    I am reading through https://aws.amazon.com/blogs/big-data/build-a-healthcare-data-warehouse-using-amazon-emr-amazon-redshift-aws-lambda-and-omop/
    and waiting on more info from MIT about how to get access to the MIMIC-III AMI with my login.
    I will send you my reply as soon as I get feedback from them.
    I am now looking into https://s3.amazonaws.com/physionet-pds/index.html
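
    As a quick way to poke at that bucket from Python, a minimal sketch (assuming boto3 is installed and the physionet-pds bucket allows anonymous listing) could be:

    import boto3
    from botocore import UNSIGNED
    from botocore.client import Config

    # Anonymous (unsigned) S3 client for browsing the public PhysioNet bucket.
    s3 = boto3.client('s3', config=Config(signature_version=UNSIGNED))

    # List the first few objects to see what is available.
    resp = s3.list_objects_v2(Bucket='physionet-pds', MaxKeys=20)
    for obj in resp.get('Contents', []):
        print(obj['Key'], obj['Size'])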
    Thanks
    Sven

    User: "sgenzer"
    Altair Employee
    Accepted Answer
    Hi @DocMusher - I did a bit more reading, and to be honest the easier way to go may be Python with the wfdb library: https://github.com/MIT-LCP/wfdb-python

    I put a quick sample into Execute Python and I can get a response from wfdb. It's not formatted or anything...just showing the data connection.

    <?xml version="1.0" encoding="UTF-8"?><process version="9.0.003">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="9.0.003" expanded="true" name="Process">
        <process expanded="true">
          <operator activated="true" class="python_scripting:execute_python" compatibility="8.2.000" expanded="true" height="103" name="Execute Python (2)" width="90" x="45" y="34">
            <parameter key="script" value="import pandas&#10;import wfdb&#10;&#10;def rm_main():&#10;&#10;    signals2, fields2 = wfdb.rdsamp('s0010_re', channels=[14, 0, 5, 10], sampfrom=100, sampto=15000, pb_dir='ptbdb/patient001/')&#10;&#10;    return fields2"/>
          </operator>
          <operator activated="true" class="text:read_document" compatibility="8.1.000" expanded="true" height="68" name="Read Document" width="90" x="179" y="34"/>
          <connect from_op="Execute Python (2)" from_port="output 1" to_op="Read Document" to_port="file"/>
          <connect from_op="Read Document" from_port="output" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>
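
    To sanity-check the data connection outside of RapidMiner first, the same call can be run as a plain Python script. A minimal sketch, assuming a wfdb release from the same era as the process above (newer releases renamed the pb_dir parameter to pn_dir):

    import wfdb

    # Same call as in the Execute Python script above: read a slice of four
    # channels from record s0010_re in the public ptbdb database on PhysioNet.
    signals, fields = wfdb.rdsamp(
        's0010_re',
        channels=[14, 0, 5, 10],
        sampfrom=100,
        sampto=15000,
        pb_dir='ptbdb/patient001/'
    )

    print(fields)         # metadata: sampling frequency, signal names, units, ...
    print(signals.shape)  # numpy array of shape (samples, channels)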
    

    I am not a Python guru by any means, but there are many folks here who are, if you need help... :wink:

    Scott

    User: "Nikouy"
    New Altair Community Member

    Do you know how RapidMiner interacts with Redshift or Azure data lakes? Is the data downloaded and loaded into RAM for analysis?

    Thanks