transform example set to histogram

keyser84
keyser84 New Altair Community Member
edited November 5 in Community Q&A

Is it possible to work directly with the histogram of the data (as shown by the plotter)?

In particular:
- given an example set with several instances
- discretize the values of one attribute into f bins
- return an example set with f attributes (one per bin) that contains a single instance, whose values are the instance counts from the source example set

I want to use this as a compact representation of an example set and then compare its histogram against the histograms of other example sets (e.g. by applying a Euclidean or Manhattan distance measure to the histogram vectors).
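The comparison described above can be sketched outside RapidMiner as follows. This is only a minimal illustration in Python, not a RapidMiner operator: the function names and the sample count vectors are made up for the example, and it assumes both histograms were built with the same bins, so that position i means the same value range in both vectors.

```python
import math


def euclidean_distance(hist_a, hist_b):
    """Euclidean (L2) distance between two histogram count vectors."""
    if len(hist_a) != len(hist_b):
        raise ValueError("histograms must have the same number of bins")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(hist_a, hist_b)))


def manhattan_distance(hist_a, hist_b):
    """Manhattan (L1) distance between two histogram count vectors."""
    if len(hist_a) != len(hist_b):
        raise ValueError("histograms must have the same number of bins")
    return sum(abs(a - b) for a, b in zip(hist_a, hist_b))


# Hypothetical bin counts for two example sets (4 bins each).
hist_a = [5, 10, 3, 2]
hist_b = [4, 12, 3, 1]

print(manhattan_distance(hist_a, hist_b))  # 4
print(euclidean_distance(hist_a, hist_b))  # sqrt(6), about 2.449
```

If the example sets differ in size, it may be worth dividing each count by the total number of instances first, so that the distance reflects the shape of the distribution rather than the size of the set.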

There are operators like BinDiscretization and Aggregation, but is there a single operator that produces exactly what the histogram plotter shows?

Answers

  • land
    land New Altair Community Member
    Hi,
    Unfortunately there is no operator that performs this task in a single step, but it is quite easy to combine Discretization and Aggregation, as you already suggested. The process below discretizes attribute a1 into 10 bins and then counts the examples in each bin, so the result has one example per bin holding that bin's count. It should do the trick.
    <operator name="Root" class="Process" expanded="yes">
        <!-- Load the example set (this path is specific to the poster's machine) -->
        <operator name="ExampleSource" class="ExampleSource">
            <parameter key="attributes" value="C:\Dokumente und Einstellungen\sland\Eigene Dateien\yale\workspace\sample\data\iris.aml"/>
        </operator>
        <operator name="ToHistogram" class="OperatorChain" expanded="yes">
            <!-- Apply the inner operator only to the attribute named a1 -->
            <operator name="AttributeSubsetPreprocessing" class="AttributeSubsetPreprocessing" expanded="yes">
                <parameter key="condition_class" value="attribute_name_filter"/>
                <parameter key="attribute_name_regex" value="a1"/>
                <!-- Discretize a1 into 10 bins -->
                <operator name="BinDiscretization" class="BinDiscretization">
                    <parameter key="number_of_bins" value="10"/>
                    <parameter key="range_name_type" value="short"/>
                </operator>
            </operator>
            <!-- Group by the discretized a1 and count the examples in each bin -->
            <operator name="Aggregation" class="Aggregation">
                <list key="aggregation_attributes">
                    <parameter key="a1" value="count"/>
                </list>
                <parameter key="group_by_attributes" value="a1"/>
            </operator>
        </operator>
    </operator>
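For readers who want to check the result outside RapidMiner, the logic of the process above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the operators' actual implementation: it assumes equal-width bins between a lower and upper bound, and the function name `histogram_counts` and its `low`/`high` parameters are hypothetical. Two points differ from a plain group-by and are worth keeping in mind: the sketch returns a count for every bin, including empty ones (a group-by typically yields rows only for values that occur), and it lets you pass fixed bounds so that histograms of different example sets share the same bins.

```python
def histogram_counts(values, number_of_bins, low=None, high=None):
    """Count values per equal-width bin and return one count per bin.

    low/high default to the min/max of the data; pass fixed bounds to make
    histograms of different data sets comparable. Values outside the bounds
    are clamped into the first or last bin.
    """
    if number_of_bins < 1:
        raise ValueError("number_of_bins must be at least 1")
    counts = [0] * number_of_bins
    if not values:
        return counts
    low = min(values) if low is None else low
    high = max(values) if high is None else high
    width = (high - low) / number_of_bins
    for value in values:
        if width == 0:
            index = 0  # all values identical: everything falls into bin 0
        else:
            index = int((value - low) / width)
            # Clamp so that value == high (or out-of-range values) stays valid.
            index = max(0, min(index, number_of_bins - 1))
        counts[index] += 1
    return counts


# Made-up sample values, 4 bins over the range [1.0, 5.0].
print(histogram_counts([1.0, 1.5, 2.0, 4.0, 5.0], 4, low=1.0, high=5.0))
# [2, 1, 0, 2]
```

The returned list corresponds to the single-instance, f-attribute representation asked for in the question: one value per bin, in bin order.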
  • keyser84
    keyser84 New Altair Community Member
    Thank you, this helps.