AutoModel and Medical Data
Dear RapidMiner Friends,
Congratulations on implementing the AutoModel tool, which I consider a critical step toward wider acceptance in a #noblackboxes community like the one I work in. This is a huge step forward! Consider how a physician makes a decision based on the signs and symptoms a patient presents with. In addition to his or her clinical view (best described as an optimized model built on years of clinical experience), the physician looks at new data from the patient to provide the best care at that point in time. For AI or any type of advanced analytics to be integrated into clinical decision-making, any new data or model needs to add knowledge or wisdom to this intellectual process. The medical community is not asking for a full understanding of the algorithms used in AI, but the findings provided by, e.g., the AutoModel tool should at least be explained. I would therefore like to prepare some kind of clinical translation of the AutoModel results on a real dataset of patients admitted to a critical care facility. The label is survival or non-survival during the ICU stay. All other attributes relate to each patient's comorbidities. I look forward to your conclusions on the results, and it might be even more interesting to schedule a Skype or RingCentral meeting in the near future.
Thanks
Sven
Best Answer
-
Presentation, medical data, feature selection and generation. Simulator.
https://youtu.be/OwU_pPLLOpA
Answers
-
Hello @SvenVanPoucke - great thoughts here about #noblackboxes in the medical context. I took your CSV file and ran it through AutoModel myself just to get some quick analysis. You will see that at the end I chose Decision Tree as my model for three reasons: 1) you said it was important to understand the model as well as get good results, and decision trees allow for this; 2) the performance of the decision tree was as good as that of the other models; and 3) the runtime was very manageable.
You will also see that the resulting decision tree is perhaps not as enlightening as expected, but at least to this non-medical person, it seems to make some sense. Basically all the disease factors are excluded from the model to maximize performance; the only factor kept is age. If you're young, we predict you do not survive; otherwise you're OK. Simple from a data science perspective, very sad from a human perspective.
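To make the shape of such a model concrete, here is a minimal Python sketch of a tree with a single split on age. It is only an illustration under assumptions: the ages and outcomes are made-up toy values, not the ICU data; the function name `best_age_split` is hypothetical; and choosing the cutoff by raw accuracy is a simplification, since RapidMiner's Decision Tree operator uses a gain-based criterion.

```python
from typing import List, Tuple

def best_age_split(rows: List[Tuple[float, str]]) -> Tuple[float, str, str, float]:
    """Return (threshold, label_if_age_le, label_if_age_gt, accuracy)
    for the single age cutoff that classifies the most rows correctly."""
    ages = sorted({age for age, _ in rows})
    labels = sorted({label for _, label in rows})
    best = None
    # Candidate cutoffs are the midpoints between consecutive distinct ages.
    for lo, hi in zip(ages, ages[1:]):
        threshold = (lo + hi) / 2
        # Try every assignment of a predicted label to each side of the cutoff.
        for left in labels:
            for right in labels:
                correct = sum(
                    1
                    for age, label in rows
                    if label == (left if age <= threshold else right)
                )
                accuracy = correct / len(rows)
                if best is None or accuracy > best[3]:
                    best = (threshold, left, right, accuracy)
    return best

# Made-up toy rows: (age, icustay_expire_flg), where "Y" means the patient
# did not survive the ICU stay. These values are NOT from the real dataset.
toy_rows = [
    (25, "Y"), (31, "Y"), (36, "Y"), (38, "N"),
    (47, "N"), (55, "Y"), (62, "N"), (70, "N"),
]

print(best_age_split(toy_rows))  # (37.0, 'Y', 'N', 0.875)
```

The returned rule reads as "age <= 37 predicts Y, otherwise N", which is the kind of one-line model a physician can inspect directly, even if it says little about the comorbidities.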
My video screenshare can be found here, and the resulting process is below.
Scott
<?xml version="1.0" encoding="UTF-8"?><process version="8.1.000">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" automodel="EXPORTED" class="process" compatibility="8.1.000" expanded="true" name="Process">
<process expanded="true">
<operator activated="true" automodel="EXPORTED" class="retrieve" compatibility="8.1.000" expanded="true" height="68" name="Retrieve Data" width="90" x="45" y="238">
<parameter key="repository_entry" value="//RapidMiner OneDrive/random community stuff/AutomodelMedical"/>
<description align="center" color="transparent" colored="false" width="126">Load data.</description>
</operator>
<operator activated="true" automodel="EXPORTED" class="subprocess" compatibility="8.1.000" expanded="true" height="82" name="Preprocessing" width="90" x="179" y="238">
<process expanded="true">
<operator activated="true" automodel="EXPORTED" class="select_subprocess" compatibility="8.1.000" expanded="true" height="82" name="Define Target?" width="90" x="45" y="34">
<parameter key="select_which" value="2"/>
<process expanded="true">
<connect from_port="input 1" to_port="output 1"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="source_input 2" spacing="0"/>
<portSpacing port="sink_output 1" spacing="0"/>
<portSpacing port="sink_output 2" spacing="0"/>
</process>
<process expanded="true">
<operator activated="true" automodel="EXPORTED" class="set_role" compatibility="8.1.000" expanded="true" height="82" name="Define Target" width="90" x="45" y="34">
<parameter key="attribute_name" value="icustay_expire_flg"/>
<parameter key="target_role" value="label"/>
<list key="set_additional_roles"/>
<description align="center" color="transparent" colored="false" width="126">Define the target column for the predictive model.</description>
</operator>
<connect from_port="input 1" to_op="Define Target" to_port="example set input"/>
<connect from_op="Define Target" from_port="example set output" to_port="output 1"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="source_input 2" spacing="0"/>
<portSpacing port="sink_output 1" spacing="0"/>
<portSpacing port="sink_output 2" spacing="0"/>
</process>
<description align="center" color="transparent" colored="false" width="126">Should define a target column?</description>
</operator>
<operator activated="true" automodel="EXPORTED" class="select_subprocess" compatibility="8.1.000" expanded="true" height="82" name="Should Discretize?" width="90" x="179" y="34">
<process expanded="true">
<connect from_port="input 1" to_port="output 1"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="source_input 2" spacing="0"/>
<portSpacing port="sink_output 1" spacing="0"/>
<portSpacing port="sink_output 2" spacing="0"/>
</process>
<process expanded="true">
<operator activated="true" automodel="EXPORTED" class="discretize_by_bins" compatibility="8.1.000" expanded="true" height="103" name="Binning" width="90" x="45" y="34">
<parameter key="attribute_filter_type" value="single"/>
<parameter key="attribute" value="Age"/>
<parameter key="include_special_attributes" value="true"/>
<parameter key="range_name_type" value="short"/>
<description align="center" color="transparent" colored="false" width="126">Discretize by binning (same range per bin).</description>
</operator>
<connect from_port="input 1" to_op="Binning" to_port="example set input"/>
<connect from_op="Binning" from_port="example set output" to_port="output 1"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="source_input 2" spacing="0"/>
<portSpacing port="sink_output 1" spacing="0"/>
<portSpacing port="sink_output 2" spacing="0"/>
</process>
<process expanded="true">
<operator activated="true" automodel="EXPORTED" class="discretize_by_frequency" compatibility="8.1.000" expanded="true" height="103" name="Frequency" width="90" x="45" y="34">
<parameter key="attribute_filter_type" value="single"/>
<parameter key="attribute" value="Age"/>
<parameter key="include_special_attributes" value="true"/>
<parameter key="range_name_type" value="short"/>
<description align="center" color="transparent" colored="false" width="126">Discretize by frequency (same count per bin).</description>
</operator>
<connect from_port="input 1" to_op="Frequency" to_port="example set input"/>
<connect from_op="Frequency" from_port="example set output" to_port="output 1"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="source_input 2" spacing="0"/>
<portSpacing port="sink_output 1" spacing="0"/>
<portSpacing port="sink_output 2" spacing="0"/>
</process>
<description align="center" color="transparent" colored="false" width="126">Should discretize numerical target column?</description>
</operator>
<operator activated="true" automodel="EXPORTED" class="select_subprocess" compatibility="8.1.000" expanded="true" height="82" name="Map Values?" width="90" x="313" y="34">
<process expanded="true">
<connect from_port="input 1" to_port="output 1"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="source_input 2" spacing="0"/>
<portSpacing port="sink_output 1" spacing="0"/>
<portSpacing port="sink_output 2" spacing="0"/>
</process>
<process expanded="true">
<operator activated="true" automodel="EXPORTED" class="map" compatibility="8.1.000" expanded="true" height="82" name="Map Values" width="90" x="45" y="34">
<parameter key="attribute_filter_type" value="single"/>
<parameter key="attribute" value="Survived"/>
<parameter key="include_special_attributes" value="true"/>
<list key="value_mappings"/>
<description align="center" color="transparent" colored="false" width="126">Map some nominal target values to new values.</description>
</operator>
<connect from_port="input 1" to_op="Map Values" to_port="example set input"/>
<connect from_op="Map Values" from_port="example set output" to_port="output 1"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="source_input 2" spacing="0"/>
<portSpacing port="sink_output 1" spacing="0"/>
<portSpacing port="sink_output 2" spacing="0"/>
</process>
<description align="center" color="transparent" colored="false" width="126">Should map nominal values?</description>
</operator>
<operator activated="true" automodel="EXPORTED" class="select_subprocess" compatibility="8.1.000" expanded="true" height="82" name="Positive Class?" width="90" x="447" y="34">
<parameter key="select_which" value="2"/>
<process expanded="true">
<connect from_port="input 1" to_port="output 1"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="source_input 2" spacing="0"/>
<portSpacing port="sink_output 1" spacing="0"/>
<portSpacing port="sink_output 2" spacing="0"/>
</process>
<process expanded="true">
<operator activated="true" automodel="EXPORTED" class="nominal_to_binominal" compatibility="8.1.000" expanded="true" height="103" name="Nominal to Binominal" width="90" x="45" y="34">
<parameter key="attribute_filter_type" value="single"/>
<parameter key="attribute" value="icustay_expire_flg"/>
<parameter key="include_special_attributes" value="true"/>
<description align="center" color="transparent" colored="false" width="126">Make sure that target is binary for positive class mapping.</description>
</operator>
<operator activated="true" automodel="EXPORTED" class="remap_binominals" compatibility="8.1.000" expanded="true" height="82" name="Define Positive Class" width="90" x="179" y="34">
<parameter key="attribute_filter_type" value="single"/>
<parameter key="attribute" value="icustay_expire_flg"/>
<parameter key="include_special_attributes" value="true"/>
<parameter key="negative_value" value="N"/>
<parameter key="positive_value" value="Y"/>
<description align="center" color="transparent" colored="false" width="126">Potentially define which one should be the positive class.</description>
</operator>
<connect from_port="input 1" to_op="Nominal to Binominal" to_port="example set input"/>
<connect from_op="Nominal to Binominal" from_port="example set output" to_op="Define Positive Class" to_port="example set input"/>
<connect from_op="Define Positive Class" from_port="example set output" to_port="output 1"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="source_input 2" spacing="0"/>
<portSpacing port="sink_output 1" spacing="0"/>
<portSpacing port="sink_output 2" spacing="0"/>
</process>
<description align="center" color="transparent" colored="false" width="126">Should define positive class?</description>
</operator>
<operator activated="true" automodel="EXPORTED" class="select_subprocess" compatibility="8.1.000" expanded="true" height="82" name="Remove Columns?" width="90" x="581" y="34">
<parameter key="select_which" value="2"/>
<process expanded="true">
<connect from_port="input 1" to_port="output 1"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="source_input 2" spacing="0"/>
<portSpacing port="sink_output 1" spacing="0"/>
<portSpacing port="sink_output 2" spacing="0"/>
</process>
<process expanded="true">
<operator activated="true" automodel="EXPORTED" class="select_attributes" compatibility="8.1.000" expanded="true" height="82" name="Remove Columns" width="90" x="45" y="34">
<parameter key="attribute_filter_type" value="regular_expression"/>
<parameter key="regular_expression" value="\Qicustay_id\E"/>
<parameter key="invert_selection" value="true"/>
<parameter key="include_special_attributes" value="true"/>
<description align="center" color="transparent" colored="false" width="126">Potentially remove columns.</description>
</operator>
<connect from_port="input 1" to_op="Remove Columns" to_port="example set input"/>
<connect from_op="Remove Columns" from_port="example set output" to_port="output 1"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="source_input 2" spacing="0"/>
<portSpacing port="sink_output 1" spacing="0"/>
<portSpacing port="sink_output 2" spacing="0"/>
</process>
<description align="center" color="transparent" colored="false" width="126">Should remove columns?</description>
</operator>
<operator activated="true" automodel="EXPORTED" class="subprocess" compatibility="8.1.000" expanded="true" height="82" name="Unify Value Types" width="90" x="715" y="34">
<process expanded="true">
<operator activated="true" automodel="EXPORTED" class="select_attributes" compatibility="8.1.000" expanded="true" height="82" name="Remove Dates" width="90" x="45" y="34">
<parameter key="attribute_filter_type" value="value_type"/>
<parameter key="value_type" value="date_time"/>
<parameter key="invert_selection" value="true"/>
<description align="center" color="transparent" colored="false" width="126">Remove all date columns.</description>
</operator>
<operator activated="true" automodel="EXPORTED" class="nominal_to_text" compatibility="8.1.000" expanded="true" height="82" name="Nominal to Text" width="90" x="179" y="34">
<parameter key="attribute_filter_type" value="value_type"/>
<parameter key="include_special_attributes" value="true"/>
<description align="center" color="transparent" colored="false" width="126">Transform all nominal columns to text so that we make sure that all will have polynominal type after the next transformation.</description>
</operator>
<operator activated="true" automodel="EXPORTED" class="text_to_nominal" compatibility="8.1.000" expanded="true" height="82" name="Text to Nominal" width="90" x="313" y="34">
<parameter key="attribute_filter_type" value="value_type"/>
<parameter key="include_special_attributes" value="true"/>
<description align="center" color="transparent" colored="false" width="126">Transform all text columns into polynominal columns.</description>
</operator>
<operator activated="true" automodel="EXPORTED" class="numerical_to_real" compatibility="8.1.000" expanded="true" height="82" name="Numerical to Real" width="90" x="447" y="34">
<parameter key="attribute_filter_type" value="value_type"/>
<parameter key="use_value_type_exception" value="true"/>
<parameter key="except_value_type" value="integer"/>
<parameter key="include_special_attributes" value="true"/>
<description align="center" color="transparent" colored="false" width="126">Turn all numerical columns (not integers though) into real columns.</description>
</operator>
<connect from_port="in 1" to_op="Remove Dates" to_port="example set input"/>
<connect from_op="Remove Dates" from_port="example set output" to_op="Nominal to Text" to_port="example set input"/>
<connect from_op="Nominal to Text" from_port="example set output" to_op="Text to Nominal" to_port="example set input"/>
<connect from_op="Text to Nominal" from_port="example set output" to_op="Numerical to Real" to_port="example set input"/>
<connect from_op="Numerical to Real" from_port="example set output" to_port="out 1"/>
<portSpacing port="source_in 1" spacing="0"/>
<portSpacing port="source_in 2" spacing="0"/>
<portSpacing port="sink_out 1" spacing="0"/>
<portSpacing port="sink_out 2" spacing="0"/>
</process>
<description align="center" color="transparent" colored="false" width="126">Unify all value types</description>
</operator>
<connect from_port="in 1" to_op="Define Target?" to_port="input 1"/>
<connect from_op="Define Target?" from_port="output 1" to_op="Should Discretize?" to_port="input 1"/>
<connect from_op="Should Discretize?" from_port="output 1" to_op="Map Values?" to_port="input 1"/>
<connect from_op="Map Values?" from_port="output 1" to_op="Positive Class?" to_port="input 1"/>
<connect from_op="Positive Class?" from_port="output 1" to_op="Remove Columns?" to_port="input 1"/>
<connect from_op="Remove Columns?" from_port="output 1" to_op="Unify Value Types" to_port="in 1"/>
<connect from_op="Unify Value Types" from_port="out 1" to_port="out 1"/>
<portSpacing port="source_in 1" spacing="0"/>
<portSpacing port="source_in 2" spacing="0"/>
<portSpacing port="sink_out 1" spacing="0"/>
<portSpacing port="sink_out 2" spacing="0"/>
</process>
<description align="center" color="transparent" colored="false" width="126">All general preprocessing steps happen inside this operator - double click on it to see the details.</description>
</operator>
<operator activated="true" automodel="EXPORTED" class="subprocess" compatibility="8.1.000" expanded="true" height="82" name="Replace Missing Values" width="90" x="313" y="238">
<process expanded="true">
<operator activated="true" automodel="EXPORTED" class="replace_missing_values" compatibility="8.1.000" expanded="true" height="103" name="Replace Nominal Missings" width="90" x="45" y="34">
<parameter key="attribute_filter_type" value="value_type"/>
<parameter key="value_type" value="nominal"/>
<parameter key="default" value="value"/>
<list key="columns"/>
<parameter key="replenishment_value" value="MISSING"/>
<description align="center" color="transparent" colored="false" width="126">Replace nominal missings with the word 'missing'.</description>
</operator>
<operator activated="true" automodel="EXPORTED" class="replace_infinite_values" compatibility="8.1.000" expanded="true" height="103" name="Replace Pos Infinite Values" width="90" x="179" y="34">
<parameter key="attribute_filter_type" value="value_type"/>
<parameter key="include_special_attributes" value="true"/>
<parameter key="default" value="missing"/>
<list key="columns"/>
<description align="center" color="transparent" colored="false" width="126">Replace positive infinity values by missing.</description>
</operator>
<operator activated="true" automodel="EXPORTED" class="replace_infinite_values" compatibility="8.1.000" expanded="true" height="103" name="Replace Neg Infinite Values" width="90" x="313" y="34">
<parameter key="attribute_filter_type" value="value_type"/>
<parameter key="include_special_attributes" value="true"/>
<parameter key="default" value="missing"/>
<list key="columns"/>
<parameter key="replenish_what" value="negative_infinity"/>
<description align="center" color="transparent" colored="false" width="126">Replace negative infinity values by missing.</description>
</operator>
<operator activated="true" automodel="EXPORTED" class="replace_missing_values" compatibility="8.1.000" expanded="true" height="103" name="Replace Numerical Missings" width="90" x="447" y="34">
<parameter key="attribute_filter_type" value="value_type"/>
<parameter key="value_type" value="numeric"/>
<list key="columns"/>
<description align="center" color="transparent" colored="false" width="126">Replace numerical missings with the average of the column.</description>
</operator>
<connect from_port="in 1" to_op="Replace Nominal Missings" to_port="example set input"/>
<connect from_op="Replace Nominal Missings" from_port="example set output" to_op="Replace Pos Infinite Values" to_port="example set input"/>
<connect from_op="Replace Pos Infinite Values" from_port="example set output" to_op="Replace Neg Infinite Values" to_port="example set input"/>
<connect from_op="Replace Neg Infinite Values" from_port="example set output" to_op="Replace Numerical Missings" to_port="example set input"/>
<connect from_op="Replace Numerical Missings" from_port="example set output" to_port="out 1"/>
<portSpacing port="source_in 1" spacing="0"/>
<portSpacing port="source_in 2" spacing="0"/>
<portSpacing port="sink_out 1" spacing="0"/>
<portSpacing port="sink_out 2" spacing="0"/>
</process>
<description align="center" color="transparent" colored="false" width="126">Replace missing values.</description>
</operator>
<operator activated="true" automodel="EXPORTED" class="order_attributes" compatibility="8.1.000" expanded="true" height="82" name="Reorder Attributes" width="90" x="447" y="238">
<parameter key="sort_mode" value="alphabetically"/>
<description align="center" color="transparent" colored="false" width="126">Order columns alphabetically.</description>
</operator>
<operator activated="true" automodel="EXPORTED" class="filter_examples" compatibility="8.1.000" expanded="true" height="103" name="Filter Examples" width="90" x="581" y="238">
<parameter key="condition_class" value="no_missing_labels"/>
<list key="filters_list"/>
<description align="center" color="transparent" colored="false" width="126">Model on cases with label value, apply the model on cases with a missing for the target column.</description>
</operator>
<operator activated="true" automodel="EXPORTED" class="sample_stratified" compatibility="8.1.000" expanded="true" height="82" name="Sample (Stratified)" width="90" x="715" y="136">
<parameter key="sample_size" value="250000"/>
<description align="center" color="transparent" colored="false" width="126">Sample down to 250,000 examples in case there are more.</description>
</operator>
<operator activated="true" automodel="EXPORTED" class="split_data" compatibility="8.1.000" expanded="true" height="103" name="Split Data" width="90" x="849" y="136">
<enumeration key="partitions">
<parameter key="ratio" value="0.8"/>
<parameter key="ratio" value="0.2"/>
</enumeration>
<description align="center" color="transparent" colored="false" width="126">Split off a validation set.</description>
</operator>
<operator activated="true" automodel="EXPORTED" class="multiply" compatibility="8.1.000" expanded="true" height="124" name="Multiply" width="90" x="983" y="136">
<description align="center" color="transparent" colored="false" width="126">Keep training data for simulator.</description>
</operator>
<operator activated="true" automodel="EXPORTED" class="concurrency:optimize_parameters_grid" compatibility="8.1.000" expanded="true" height="124" name="Find Optimal Depth" width="90" x="1117" y="34">
<list key="parameters">
<parameter key="Decision Tree.maximal_depth" value="2,3,5,7,10,15,25,40"/>
</list>
<parameter key="log_performance" value="false"/>
<process expanded="true">
<operator activated="true" automodel="EXPORTED" class="concurrency:cross_validation" compatibility="8.1.000" expanded="true" height="145" name="Cross Validation" width="90" x="45" y="34">
<parameter key="number_of_folds" value="5"/>
<parameter key="use_local_random_seed" value="true"/>
<process expanded="true">
<operator activated="true" automodel="EXPORTED" class="concurrency:parallel_decision_tree" compatibility="8.1.000" expanded="true" height="103" name="Decision Tree" width="90" x="45" y="34">
<parameter key="maximal_depth" value="40"/>
<parameter key="minimal_gain" value="0.05"/>
</operator>
<connect from_port="training set" to_op="Decision Tree" to_port="training set"/>
<connect from_op="Decision Tree" from_port="model" to_port="model"/>
<portSpacing port="source_training set" spacing="0"/>
<portSpacing port="sink_model" spacing="0"/>
<portSpacing port="sink_through 1" spacing="0"/>
</process>
<process expanded="true">
<operator activated="true" automodel="EXPORTED" class="apply_model" compatibility="8.1.000" expanded="true" height="82" name="Apply Model" width="90" x="45" y="34">
<list key="application_parameters"/>
</operator>
<operator activated="true" automodel="EXPORTED" class="performance_binominal_classification" compatibility="8.1.000" expanded="true" height="82" name="Performance" width="90" x="179" y="34">
<parameter key="main_criterion" value="accuracy"/>
<parameter key="classification_error" value="true"/>
<parameter key="AUC" value="true"/>
<parameter key="precision" value="true"/>
<parameter key="recall" value="true"/>
<parameter key="f_measure" value="true"/>
<parameter key="sensitivity" value="true"/>
<parameter key="specificity" value="true"/>
</operator>
<connect from_port="model" to_op="Apply Model" to_port="model"/>
<connect from_port="test set" to_op="Apply Model" to_port="unlabelled data"/>
<connect from_op="Apply Model" from_port="labelled data" to_op="Performance" to_port="labelled data"/>
<connect from_op="Performance" from_port="performance" to_port="performance 1"/>
<portSpacing port="source_model" spacing="0"/>
<portSpacing port="source_test set" spacing="0"/>
<portSpacing port="source_through 1" spacing="0"/>
<portSpacing port="sink_test set results" spacing="0"/>
<portSpacing port="sink_performance 1" spacing="0"/>
<portSpacing port="sink_performance 2" spacing="0"/>
</process>
<description align="center" color="transparent" colored="false" width="126">Cross-validate the model and build final model on complete data.</description>
</operator>
<operator activated="true" automodel="EXPORTED" class="log" compatibility="8.1.000" expanded="true" height="82" name="Log Performances" width="90" x="179" y="85">
<list key="log">
<parameter key="Maximal Depth" value="operator.Decision Tree.parameter.maximal_depth"/>
<parameter key="Performance" value="operator.Cross Validation.value.performance main criterion"/>
</list>
<description align="center" color="transparent" colored="false" width="126">Log all performances for each parameter combination.</description>
</operator>
<connect from_port="input 1" to_op="Cross Validation" to_port="example set"/>
<connect from_op="Cross Validation" from_port="model" to_port="model"/>
<connect from_op="Cross Validation" from_port="performance 1" to_op="Log Performances" to_port="through 1"/>
<connect from_op="Log Performances" from_port="through 1" to_port="performance"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="source_input 2" spacing="0"/>
<portSpacing port="sink_performance" spacing="0"/>
<portSpacing port="sink_model" spacing="0"/>
<portSpacing port="sink_output 1" spacing="0"/>
</process>
<description align="center" color="transparent" colored="false" width="126">Find optimal parameters.</description>
</operator>
<operator activated="true" automodel="EXPORTED" class="multiply" compatibility="8.1.000" expanded="true" height="103" name="Multiply (2)" width="90" x="983" y="391">
<description align="center" color="transparent" colored="false" width="126">Copy validation data.</description>
</operator>
<operator activated="true" automodel="EXPORTED" class="model_simulator:model_simulator" compatibility="8.1.000" expanded="true" height="103" name="Model Simulator" width="90" x="1251" y="85">
<description align="center" color="transparent" colored="false" width="126">Create model simulator.</description>
</operator>
<operator activated="true" automodel="EXPORTED" class="multiply" compatibility="8.1.000" expanded="true" height="124" name="Multiply (3)" width="90" x="1385" y="136">
<description align="center" color="transparent" colored="false" width="126">Copy model.</description>
</operator>
<operator activated="true" automodel="EXPORTED" class="model_simulator:explain_predictions" compatibility="8.1.000" expanded="true" height="103" name="Explain Predictions" width="90" x="1519" y="238">
<description align="center" color="transparent" colored="false" width="126">Create predictions for cases without value and add explanations for predictions.</description>
</operator>
<operator activated="true" automodel="EXPORTED" class="log_to_data" compatibility="8.1.000" expanded="true" height="82" name="Log to Data" width="90" x="1519" y="646">
<parameter key="log_name" value="Log Performances"/>
<description align="center" color="transparent" colored="false" width="126">Deliver all logged performances.</description>
</operator>
<operator activated="true" automodel="EXPORTED" class="model_simulator:lift_chart" compatibility="8.1.000" expanded="true" height="82" name="Create Lift Chart" width="90" x="1519" y="493">
<parameter key="target class" value="Y"/>
<description align="center" color="transparent" colored="false" width="126">Create lift chart.</description>
</operator>
<connect from_op="Retrieve Data" from_port="output" to_op="Preprocessing" to_port="in 1"/>
<connect from_op="Preprocessing" from_port="out 1" to_op="Replace Missing Values" to_port="in 1"/>
<connect from_op="Replace Missing Values" from_port="out 1" to_op="Reorder Attributes" to_port="example set input"/>
<connect from_op="Reorder Attributes" from_port="example set output" to_op="Filter Examples" to_port="example set input"/>
<connect from_op="Filter Examples" from_port="example set output" to_op="Sample (Stratified)" to_port="example set input"/>
<connect from_op="Filter Examples" from_port="unmatched example set" to_op="Explain Predictions" to_port="test data"/>
<connect from_op="Sample (Stratified)" from_port="example set output" to_op="Split Data" to_port="example set"/>
<connect from_op="Split Data" from_port="partition 1" to_op="Multiply" to_port="input"/>
<connect from_op="Split Data" from_port="partition 2" to_op="Multiply (2)" to_port="input"/>
<connect from_op="Multiply" from_port="output 1" to_op="Find Optimal Depth" to_port="input 1"/>
<connect from_op="Multiply" from_port="output 2" to_op="Model Simulator" to_port="training data"/>
<connect from_op="Multiply" from_port="output 3" to_op="Explain Predictions" to_port="training data"/>
<connect from_op="Find Optimal Depth" from_port="performance" to_port="result 1"/>
<connect from_op="Find Optimal Depth" from_port="model" to_op="Model Simulator" to_port="model"/>
<connect from_op="Find Optimal Depth" from_port="parameter set" to_port="result 2"/>
<connect from_op="Multiply (2)" from_port="output 1" to_op="Model Simulator" to_port="test data"/>
<connect from_op="Multiply (2)" from_port="output 2" to_op="Create Lift Chart" to_port="test data"/>
<connect from_op="Model Simulator" from_port="simulator output" to_port="result 3"/>
<connect from_op="Model Simulator" from_port="model output" to_op="Multiply (3)" to_port="input"/>
<connect from_op="Multiply (3)" from_port="output 1" to_port="result 4"/>
<connect from_op="Multiply (3)" from_port="output 2" to_op="Explain Predictions" to_port="model"/>
<connect from_op="Multiply (3)" from_port="output 3" to_op="Create Lift Chart" to_port="model"/>
<connect from_op="Explain Predictions" from_port="visualization output" to_port="result 5"/>
<connect from_op="Explain Predictions" from_port="example set output" to_port="result 6"/>
<connect from_op="Log to Data" from_port="exampleSet" to_port="result 8"/>
<connect from_op="Create Lift Chart" from_port="lift chart" to_port="result 7"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
<portSpacing port="sink_result 3" spacing="21"/>
<portSpacing port="sink_result 4" spacing="21"/>
<portSpacing port="sink_result 5" spacing="84"/>
<portSpacing port="sink_result 6" spacing="0"/>
<portSpacing port="sink_result 7" spacing="252"/>
<portSpacing port="sink_result 8" spacing="84"/>
<portSpacing port="sink_result 9" spacing="0"/>
<description align="left" color="yellow" colored="false" height="178" resized="true" width="488" x="445" y="490">Results:&lt;br&gt;1. Performance from 5-fold cross validation&lt;br&gt;2. Optimal parameters&lt;br&gt;3. Model simulator&lt;br&gt;4. Model&lt;br&gt;5. Predicted data with explanations viz (only if the data had missing labels)&lt;br&gt;6. Predicted data with explanations table (only if the data had missing labels)&lt;br&gt;7. Lift chart&lt;br&gt;8. All performances</description>
</process>
</operator>
</process>
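For anyone who wants to check what the exported process actually tunes without opening RapidMiner, the XML can be read with Python's standard library. The sketch below is only an illustration: `EXCERPT` is a hand-trimmed copy of the "Find Optimal Depth" operator from the process above (layout attributes dropped), and `operator_parameters` is a hypothetical helper name, not part of any RapidMiner API.

```python
import xml.etree.ElementTree as ET

# Hand-trimmed excerpt of the exported process (tuning operators only).
EXCERPT = """
<operator class="concurrency:optimize_parameters_grid" name="Find Optimal Depth">
  <list key="parameters">
    <parameter key="Decision Tree.maximal_depth" value="2,3,5,7,10,15,25,40"/>
  </list>
  <process expanded="true">
    <operator class="concurrency:cross_validation" name="Cross Validation">
      <parameter key="number_of_folds" value="5"/>
      <process expanded="true">
        <operator class="concurrency:parallel_decision_tree" name="Decision Tree">
          <parameter key="maximal_depth" value="40"/>
          <parameter key="minimal_gain" value="0.05"/>
        </operator>
      </process>
    </operator>
  </process>
</operator>
"""

def operator_parameters(xml_text):
    """Map each operator name to the parameters set directly on it."""
    root = ET.fromstring(xml_text)
    result = {}
    for op in root.iter("operator"):
        params = {}
        # Direct <parameter> children, plus those inside a direct <list>.
        for p in op.findall("parameter") + op.findall("list/parameter"):
            params[p.get("key")] = p.get("value")
        result[op.get("name")] = params
    return result

params = operator_parameters(EXCERPT)
depths = [int(d) for d in
          params["Find Optimal Depth"]["Decision Tree.maximal_depth"].split(",")]
folds = int(params["Cross Validation"]["number_of_folds"])

# Grid of tree depths, number of folds, and depth/fold combinations evaluated.
print(depths, folds, len(depths) * folds)  # [2, 3, 5, 7, 10, 15, 25, 40] 5 40
```

Read this way, the process tries eight values of maximal_depth, each scored by 5-fold cross-validation on the 80% training partition, with minimal_gain fixed at 0.05.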
Hi colleague,
A trial of AutoModel showed that it is a very fast and easy way to build a machine learning model.
The medical dataset, with 5,367 patients, is a fresh example.
The analyses applied are as follows:
1. Clustering using the X-Means algorithm
2. Visualization of the four clusters using Principal Component Analysis
3. Classification of disease risk using a Decision Tree
The results are in the PDF file below.
Thank you for the opportunity to study a domain-specific dataset.