"Retrieve KNN Distance Results"

User: "michaelgloven"
New Altair Community Member
Updated by Jocelyn

Hi, is there an operator to extract the distance results from applying the k-NN lazy learner to labeled and scored data? I would like to see the underlying data driving the predictions.

    User: "kypexin"
    New Altair Community Member

    Hi @michaelgloven

    I'm not 100% sure, but the 'Cross Distances' operator might help you in this case.

    I have never used it on real data myself, but it seems to offer the same distance measures as k-NN does.

    User: "JEdward"
    New Altair Community Member

    Which distances are you looking for? Listing the distances to the k nearest neighbors themselves works fine for k between 1 and 3, but it will look pretty messy once you reach k = 50 or more.

     

    Here's a sample process, though personally I'm not too keen on it.

     

    <?xml version="1.0" encoding="UTF-8"?><process version="8.2.000">
    <context>
    <input/>
    <output/>
    <macros/>
    </context>
    <operator activated="true" class="process" compatibility="8.2.000" expanded="true" name="Process">
    <process expanded="true">
    <operator activated="true" class="retrieve" compatibility="8.2.000" expanded="true" height="68" name="Retrieve Iris" width="90" x="45" y="136">
    <parameter key="repository_entry" value="//Samples/data/Iris"/>
    </operator>
    <operator activated="true" class="split_data" compatibility="8.2.000" expanded="true" height="103" name="Split Data" width="90" x="179" y="136">
    <enumeration key="partitions">
    <parameter key="ratio" value="0.7"/>
    <parameter key="ratio" value="0.3"/>
    </enumeration>
    </operator>
    <operator activated="true" class="k_nn" compatibility="8.2.000" expanded="true" height="82" name="k-NN" width="90" x="313" y="238">
    <parameter key="k" value="3"/>
    </operator>
    <operator activated="true" class="cross_distances" compatibility="8.2.000" expanded="true" height="103" name="Cross Distances" width="90" x="380" y="85">
    <parameter key="only_top_k" value="true"/>
    <parameter key="k" value="3"/>
    </operator>
    <operator activated="true" class="aggregate" compatibility="8.2.000" expanded="true" height="82" name="Aggregate" width="90" x="514" y="34">
    <list key="aggregation_attributes">
    <parameter key="distance" value="average"/>
    </list>
    <parameter key="group_by_attributes" value="request"/>
    </operator>
    <operator activated="true" class="apply_model" compatibility="8.2.000" expanded="true" height="82" name="Apply Model" width="90" x="581" y="289">
    <list key="application_parameters"/>
    </operator>
    <operator activated="true" class="concurrency:join" compatibility="8.2.000" expanded="true" height="82" name="Join" width="90" x="648" y="34">
    <parameter key="use_id_attribute_as_key" value="false"/>
    <list key="key_attributes">
    <parameter key="id" value="request"/>
    </list>
    </operator>
    <operator activated="true" class="set_role" compatibility="8.2.000" expanded="true" height="82" name="Set Role" width="90" x="715" y="136">
    <parameter key="attribute_name" value="average(distance)"/>
    <parameter key="target_role" value="distance_measure"/>
    <list key="set_additional_roles"/>
    </operator>
    <connect from_op="Retrieve Iris" from_port="output" to_op="Split Data" to_port="example set"/>
    <connect from_op="Split Data" from_port="partition 1" to_op="k-NN" to_port="training set"/>
    <connect from_op="Split Data" from_port="partition 2" to_op="Cross Distances" to_port="request set"/>
    <connect from_op="k-NN" from_port="model" to_op="Apply Model" to_port="model"/>
    <connect from_op="k-NN" from_port="exampleSet" to_op="Cross Distances" to_port="reference set"/>
    <connect from_op="Cross Distances" from_port="result set" to_op="Aggregate" to_port="example set input"/>
    <connect from_op="Cross Distances" from_port="request set" to_op="Apply Model" to_port="unlabelled data"/>
    <connect from_op="Aggregate" from_port="example set output" to_op="Join" to_port="right"/>
    <connect from_op="Apply Model" from_port="labelled data" to_op="Join" to_port="left"/>
    <connect from_op="Join" from_port="join" to_op="Set Role" to_port="example set input"/>
    <connect from_op="Set Role" from_port="example set output" to_port="result 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="sink_result 1" spacing="0"/>
    <portSpacing port="sink_result 2" spacing="0"/>
    </process>
    </operator>
    </process>
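
For readers who want to see the idea outside RapidMiner, here is a minimal Python sketch of what the process above appears to compute: for each row of the request set, the distances to its k closest rows in the reference set, then the average of those distances per request. This is not the operators' actual implementation. The function names and the toy data are made up for illustration, and Euclidean distance is an assumption (the process does not set a distance measure explicitly).

```python
import math

def top_k_distances(request_set, reference_set, k):
    """For each request row, return its k closest reference rows
    as (reference_index, distance) pairs, nearest first."""
    result = {}
    for req_idx, request in enumerate(request_set):
        # Distance from this request row to every reference row (Euclidean, assumed).
        pairs = [(ref_idx, math.dist(request, ref))
                 for ref_idx, ref in enumerate(reference_set)]
        pairs.sort(key=lambda pair: pair[1])
        result[req_idx] = pairs[:k]
    return result

def average_distance(neighbours):
    """Average the kept distances per request row (assumes a non-empty reference set)."""
    return {req: sum(d for _, d in pairs) / len(pairs)
            for req, pairs in neighbours.items()}

# Hypothetical toy data: reference rows play the role of the training partition,
# request rows the role of the scored partition.
reference = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0), (1.0, 0.0)]
requests = [(0.0, 0.0), (6.0, 8.0)]

neighbours = top_k_distances(requests, reference, k=2)
averages = average_distance(neighbours)

for req, pairs in neighbours.items():
    print(req, pairs, averages[req])
```

Keeping the individual `(reference_index, distance)` pairs rather than only the average is what lets you inspect which training rows drive each prediction. As noted above, that listing grows quickly as k increases.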

     

     
    User: "michaelgloven"
    New Altair Community Member
    OP
    Accepted Answer

    Good ideas. It looks like I can get what I'm looking for through the Data to Similarity operator. Also, the graph outputs (spring) are especially helpful in visualizing the KNN method.
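
As a rough illustration of the pairwise idea mentioned in the accepted answer, the Python sketch below computes the distance between every pair of rows in a single data set and stores it as an edge list, which is also the shape a graph view would draw from. It is only an assumption about what such an output conceptually contains, not the operator's real output format. The helper names and data are hypothetical, and Euclidean distance is assumed.

```python
import math
from itertools import combinations

def pairwise_distances(rows):
    """Return one (first_index, second_index, distance) edge per unordered pair of rows."""
    return [(i, j, math.dist(a, b))
            for (i, a), (j, b) in combinations(enumerate(rows), 2)]

def nearest(edges, index, k):
    """From the edge list, pick the k closest neighbours of one row."""
    candidates = []
    for i, j, d in edges:
        if i == index:
            candidates.append((j, d))
        elif j == index:
            candidates.append((i, d))
    candidates.sort(key=lambda pair: pair[1])
    return candidates[:k]

# Hypothetical toy data set with four rows.
rows = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0), (3.0, 5.0)]
edges = pairwise_distances(rows)

print(edges)
print(nearest(edges, index=1, k=1))
```

The edge list has n(n-1)/2 entries for n rows, so this view is practical for small data sets but becomes large quickly.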