"Calculating PCA scores"
frankie
New Altair Community Member
Hi,
I'm looking at example #13 in the RM help. How does one calculate the PCA scores based on the dataset values and the selected components?
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.1.008">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="5.0.000" expanded="true" name="Root">
<description>The calculation of principal components is often used as a feature transforming preprocessing step. It can reduce the dimensionality of the data set at hand while the major data variance is preserved. Perform the process and check out the plot view of the Iris data set loaded and transformed by this process.</description>
<process expanded="true" height="494" width="433">
<operator activated="true" class="retrieve" compatibility="5.0.000" expanded="true" height="60" name="Retrieve" width="90" x="45" y="30">
<parameter key="repository_entry" value="../../data/Iris"/>
</operator>
<operator activated="true" class="normalize" compatibility="5.0.000" expanded="true" height="94" name="Normalization" width="90" x="180" y="30"/>
<operator activated="true" class="principal_component_analysis" compatibility="5.0.000" expanded="true" height="94" name="PrincipalComponents" width="90" x="313" y="30"/>
<connect from_op="Retrieve" from_port="output" to_op="Normalization" to_port="example set input"/>
<connect from_op="Normalization" from_port="example set output" to_op="PrincipalComponents" to_port="example set input"/>
<connect from_op="PrincipalComponents" from_port="example set output" to_port="result 1"/>
<connect from_op="PrincipalComponents" from_port="preprocessing model" to_port="result 2"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="18"/>
<portSpacing port="sink_result 3" spacing="0"/>
</process>
</operator>
</process>
Answers
In RapidMiner, PCA is treated as a "model". To get the scores, you "apply" that model to the example set with the Apply Model operator, as in this process:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.1.008">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="5.1.008" expanded="true" name="Root">
<description>The calculation of principal components is often used as a feature transforming preprocessing step. It can reduce the dimensionality of the data set at hand while the major data variance is preserved. Perform the process and check out the plot view of the Iris data set loaded and transformed by this process.</description>
<process expanded="true" height="494" width="748">
<operator activated="true" class="retrieve" compatibility="5.1.008" expanded="true" height="60" name="Retrieve (2)" width="90" x="45" y="165">
<parameter key="repository_entry" value="//Samples/data/Iris"/>
</operator>
<operator activated="true" class="normalize" compatibility="5.1.008" expanded="true" height="94" name="Normalization" width="90" x="246" y="30"/>
<operator activated="true" class="principal_component_analysis" compatibility="5.1.008" expanded="true" height="94" name="PrincipalComponents" width="90" x="447" y="30">
<parameter key="dimensionality_reduction" value="fixed number"/>
</operator>
<operator activated="true" class="apply_model" compatibility="5.1.008" expanded="true" height="76" name="Apply Model" width="90" x="648" y="75">
<list key="application_parameters"/>
</operator>
<connect from_op="Retrieve (2)" from_port="output" to_op="Normalization" to_port="example set input"/>
<connect from_op="Normalization" from_port="example set output" to_op="PrincipalComponents" to_port="example set input"/>
<connect from_op="PrincipalComponents" from_port="original" to_op="Apply Model" to_port="unlabelled data"/>
<connect from_op="PrincipalComponents" from_port="preprocessing model" to_op="Apply Model" to_port="model"/>
<connect from_op="Apply Model" from_port="labelled data" to_port="result 1"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="18"/>
</process>
</operator>
</process>
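For intuition about what applying the model computes, here is a minimal Python sketch (not RapidMiner code). It assumes the usual PCA convention: subtract each attribute's mean, then take the dot product of the centered row with each selected component's weight vector. The function name `pca_scores` and all numbers are hypothetical, made up for illustration; the real means and component weights would be read from the PCA model in the results view.

```python
def pca_scores(rows, means, components):
    """Project each row onto the selected principal components.

    rows       -- list of examples, each a list of attribute values
    means      -- per-attribute means used for centering (assumed convention)
    components -- list of component weight vectors, one weight per attribute
    """
    scores = []
    for row in rows:
        # Center the example by subtracting each attribute's mean.
        centered = [x - m for x, m in zip(row, means)]
        # One score per component: dot product of centered row and weights.
        scores.append(
            [sum(c * w for c, w in zip(centered, comp)) for comp in components]
        )
    return scores


# Hypothetical two-attribute data, means, and component weights.
rows = [[1.0, 2.0], [3.0, 4.0]]
means = [2.0, 3.0]
components = [[0.6, 0.8], [-0.8, 0.6]]

scores = pca_scores(rows, means, components)
for example_scores in scores:
    print(example_scores)
```

Since the process normalizes the data before PCA, the attribute means going into the projection should already be close to zero, so the scores are essentially the normalized values multiplied by the component weights.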