
Transition Matrix Operator: simple question

User: "chunga"
Updated by Jocelyn
Hi!

I've just started with RapidMiner and think it's amazing. Being relatively new to data mining and machine learning, I'm starting simple, so please forgive me if this question is naive.

I created a process (XML below) to generate some nominal data so I could try to understand the "Transition Matrix" operator.

The process produces the following transition matrix:

value0 0.0 0.3386693346673337 0.0
value1 0.0 0.33316658329164583 0.0
value2 0.0 0.0 0.3281640820410205
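
A quick arithmetic check on these numbers may help whoever answers. The sketch below is not part of the RapidMiner process; it only sums the three non-zero entries and, assuming the 2000 examples configured in the process XML (so 1999 consecutive pairs), rescales them to see whether they correspond to whole-number counts. The variable names are made up for illustration.

```python
import math

# The three non-zero entries copied from the operator output above.
entries = [0.3386693346673337, 0.33316658329164583, 0.3281640820410205]

# Sum over the whole matrix (all other entries are 0.0).
total = sum(entries)

# Assumption: 2000 examples give 1999 consecutive (previous, next) pairs.
n_pairs = 2000 - 1
implied_counts = [p * n_pairs for p in entries]

print("sum of all entries:", total)
print("entries scaled by 1999:", implied_counts)
```

If the sum comes out at 1 and the scaled values are near integers, that would suggest the entries are normalized over the whole matrix rather than per row, though this check alone does not explain why only one cell per row is non-zero.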

Now, I'm sure it's because I don't know what I'm looking at, but I wrote a quick Perl script to calculate what I thought was the same thing, and it produced the following result (from the same example set that generated the transition matrix above):

        value0  value1  value2
value0  0.325  0.360  0.315
value1  0.366  0.297  0.336
value2  0.323  0.341  0.335

So you can see that my Perl code reflects my (perhaps mistaken) understanding that each row of the transition matrix should total 1.
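
To make that expectation concrete, here is a minimal sketch of a row-normalized transition matrix in Python. It is only an illustration of the calculation described above, not how the Transition Matrix operator works; the function name `row_normalized_transitions` and the short sample sequence are hypothetical. Unlike the Perl script further down, it divides by the number of outgoing transitions, so every row sums to exactly 1.

```python
from collections import Counter, defaultdict


def row_normalized_transitions(states):
    """Return {from_state: {to_state: P(to | from)}} over consecutive pairs."""
    pair_counts = defaultdict(Counter)
    # Count each (previous, next) pair in the sequence.
    for prev, nxt in zip(states, states[1:]):
        pair_counts[prev][nxt] += 1

    labels = sorted(set(states))
    matrix = {}
    for prev in labels:
        # Number of transitions leaving this state.
        outgoing = sum(pair_counts[prev].values())
        matrix[prev] = {
            nxt: (pair_counts[prev][nxt] / outgoing if outgoing else 0.0)
            for nxt in labels
        }
    return matrix


# Hypothetical toy sequence, not the generated example set.
sequence = ["value0", "value1", "value1", "value2", "value0", "value2", "value1"]
matrix = row_normalized_transitions(sequence)

for state, row in matrix.items():
    print(state, " ".join(f"{p:.3f}" for p in row.values()))
```

Each printed row is a conditional distribution over the next value given the current one, which is the sense in which I expected the rows to total 1.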

It's obvious to me that I don't understand the nuance in the description of the Transition Matrix operator:

"This operator calculates the transition matrix of a specified attribute, i.e. the operator counts how often each possible nominal value follows after each other."

Would some kind soul please put me out of my misery and explain what I am seeing when I look at the output of the Transition Matrix operator?

Many thanks!
Here's the XML for the process I created:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.1.014">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="5.1.014" expanded="true" name="Process">
    <process expanded="true" height="325" width="145">
      <operator activated="true" class="generate_nominal_data" compatibility="5.1.014" expanded="true" height="60" name="Generate Nominal Data" width="90" x="45" y="255">
        <parameter key="number_examples" value="2000"/>
        <parameter key="number_of_attributes" value="1"/>
        <parameter key="number_of_values" value="3"/>
      </operator>
      <operator activated="true" class="write_csv" compatibility="5.1.014" expanded="true" height="60" name="Write CSV" width="90" x="151" y="254">
        <parameter key="csv_file" value="C:\Documents and Settings\MikeN\My Documents\Mike\tmat.csv"/>
        <parameter key="column_separator" value=","/>
        <parameter key="quote_nominal_values" value="false"/>
      </operator>
      <operator activated="true" class="transition_matrix" compatibility="5.1.014" expanded="true" height="76" name="Transition Matrix" width="90" x="333" y="227">
        <parameter key="attribute" value="att1"/>
      </operator>
      <connect from_op="Generate Nominal Data" from_port="output" to_op="Write CSV" to_port="input"/>
      <connect from_op="Write CSV" from_port="through" to_op="Transition Matrix" to_port="example set"/>
      <connect from_op="Transition Matrix" from_port="example set" to_port="result 1"/>
      <connect from_op="Transition Matrix" from_port="transition matrix" to_port="result 2"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
      <portSpacing port="sink_result 3" spacing="0"/>
    </process>
  </operator>
</process>
Here's my Perl script:

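#!/usr/bin/perl
use strict;
use warnings;

my $curr_state;     # state seen on the previous line
my %trans;          # $trans{from}{to} = number of from -> to transitions
my %state_counts;   # number of times each state occurs

<>;                 # skip the CSV header line
while (<>) {
  chomp;
  my ($state) = split /,/;   # take the first CSV column as the state
  $state_counts{$state}++;
  if (defined $curr_state) {
    $trans{$curr_state}->{$state}++;
  }
  $curr_state = $state;
}

# Header row, then one row per "from" state with one column per "to" state.
# Note: the final example has no successor, so the row for its state
# sums to slightly less than 1.
my @states = sort keys %state_counts;
print "\t", join("\t", @states), "\n";
foreach my $from (sort keys %trans) {
  print $from;
  foreach my $to (@states) {
    # Transitions that never occur print as 0.000 so the columns stay aligned.
    my $count = $trans{$from}->{$to} || 0;
    print "\t", sprintf("%0.3f", $count / $state_counts{$from});
  }
  print "\n";
}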
