OutlierDistanceBasedDetection
alexman
New Altair Community Member
Hi,
I have an ExampleSet with several attributes, and I would like to apply outlier detection to every numerical attribute separately. Is this possible, and what is the best way to do it? I can apply outlier detection to the whole table, but I would like to do it per attribute. For example:
height weight ..... // more attributes
188 80
185 150
186 83
189 89
190 87
192 86
145 88
I would like to get 145 (height) and 150 (weight) separately ... [A separate process for each attribute that applies DBOutlierOperator would probably work, but it would not be efficient...]
DBOutlierOperator(OperatorDescription description) cannot be applied to a single attribute of an ExampleSet. AttributeSelectionExampleSet, which filters the attributes I want in the ExampleSet, would probably be useful, but how do I apply the outlier function to each attribute?
thanks
Answers
Hi,
you could combine the FeatureIterator with the AttributeSubsetPreprocessing operator, which delivers only a subset of the ExampleSet's attributes to its child operators. Since this is a fairly complex process, I will post a sample below:
<operator name="Root" class="Process" expanded="yes">
    <operator name="ExampleSetGenerator" class="ExampleSetGenerator">
        <parameter key="target_function" value="sum classification"/>
    </operator>
    <operator name="IOStorer" class="IOStorer">
        <parameter key="name" value="Store"/>
        <parameter key="io_object" value="ExampleSet"/>
        <parameter key="remove_from_process" value="false"/>
    </operator>
    <operator name="FeatureIterator" class="FeatureIterator" expanded="yes">
        <parameter key="work_on_input" value="false"/>
        <operator name="IORetriever" class="IORetriever">
            <parameter key="name" value="Store"/>
            <parameter key="io_object" value="ExampleSet"/>
        </operator>
        <operator name="AttributeSubsetPreprocessing" class="AttributeSubsetPreprocessing" expanded="yes">
            <parameter key="condition_class" value="attribute_name_filter"/>
            <parameter key="parameter_string" value="att5"/>
            <parameter key="attribute_name_regex" value="%{loop_feature}"/>
            <operator name="DetectionOnSingleAttribute" class="DensityBasedOutlierDetection">
                <parameter key="distance" value="1.0"/>
                <parameter key="proportion" value="0.5"/>
            </operator>
            <operator name="DoingSomething" class="OperatorChain" expanded="yes">
                <operator name="ChangeAttributeRole" class="ChangeAttributeRole">
                    <parameter key="name" value="Outlier"/>
                </operator>
                <operator name="ChangeAttributeName" class="ChangeAttributeName">
                    <parameter key="old_name" value="Outlier"/>
                    <parameter key="new_name" value="Outlier_%{loop_feature}"/>
                </operator>
            </operator>
        </operator>
        <operator name="IOStorer (2)" class="IOStorer">
            <parameter key="name" value="Store"/>
            <parameter key="io_object" value="ExampleSet"/>
            <parameter key="remove_from_process" value="false"/>
        </operator>
    </operator>
    <operator name="IOConsumer" class="IOConsumer">
        <parameter key="io_object" value="ExampleSet"/>
    </operator>
    <operator name="IORetriever (2)" class="IORetriever">
        <parameter key="name" value="Store"/>
        <parameter key="io_object" value="ExampleSet"/>
    </operator>
</operator>
The problem is the behavior of the FeatureIterator: it does not deliver the changed ExampleSet after it finishes. That is why we have to use the IOStorer and IORetriever operators to save the generated ExampleSet ourselves. All we actually need from the FeatureIterator is the macro it defines (%{loop_feature}), which gives us the name of every regular attribute so that we can use it in the AttributeSubsetPreprocessing condition.
This sample only renames the resulting attributes, but you may well want to do something more intelligent, such as combining the per-attribute results using an AttributeConstruction, or something else.
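Outside of RapidMiner, the idea of the process above (loop over the attributes, run a distance-based detection on one column at a time, and store the flags under a per-attribute name) can be sketched in plain Python. This is only an illustration and not the operator's implementation: the function name db_outlier_flags, the rule "at least a given proportion of the other values lie farther away than a given distance", and the thresholds distance=10.0 and proportion=0.9 are assumptions chosen for the height/weight table from the question.

```python
def db_outlier_flags(values, distance, proportion):
    """Flag a value if at least `proportion` of the OTHER values
    lie more than `distance` away from it (assumed rule, see text)."""
    flags = []
    for i, v in enumerate(values):
        others = [w for j, w in enumerate(values) if j != i]
        far = sum(1 for w in others if abs(v - w) > distance)
        flags.append(bool(others) and far / len(others) >= proportion)
    return flags


# Example table from the question, one list per attribute.
table = {
    "height": [188, 185, 186, 189, 190, 192, 145],
    "weight": [80, 150, 83, 89, 87, 86, 88],
}

# One flag column per attribute, named like Outlier_%{loop_feature}.
# distance and proportion are hypothetical thresholds for this data.
outlier_flags = {
    f"Outlier_{name}": db_outlier_flags(column, distance=10.0, proportion=0.9)
    for name, column in table.items()
}

for name, column in table.items():
    flagged = [v for v, f in zip(column, outlier_flags[f"Outlier_{name}"]) if f]
    print(name, flagged)
```

With these assumed thresholds, each attribute is judged only against its own column, so the row with height 145 and the row with weight 150 are flagged independently of each other.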
Greetings,
Sebastian
OK,
now I have detected the outliers, but I need to get the individual results.
I have a table, but I have to select the attribute values where the outlier flag is true.
An SQL statement would be "select * from table where outlier_vel=true", but what I have are the results of the process (in an ExampleSet), and I cannot run an SQL-like query on them...
What is the best way to query the results in the ExampleSet (applying filters)?
Thanks a lot!
Hi,
did you try the ExampleFilter? It allows several conditions for filtering examples from the set.
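Conceptually, such a filter does the same job as the "where" clause of an SQL query: it keeps only the examples that satisfy a condition. A minimal Python sketch of that idea follows. It does not use the RapidMiner API; the helper filter_examples and the column name Outlier_height are hypothetical names chosen for illustration.

```python
# Rows of the example table, each with a per-attribute outlier flag.
rows = [
    {"height": 188, "weight": 80, "Outlier_height": False},
    {"height": 185, "weight": 150, "Outlier_height": False},
    {"height": 186, "weight": 83, "Outlier_height": False},
    {"height": 145, "weight": 88, "Outlier_height": True},
]


def filter_examples(rows, attribute, value):
    """Keep only the rows whose `attribute` equals `value`."""
    return [row for row in rows if row[attribute] == value]


# Roughly: select * from table where Outlier_height = true
selected = filter_examples(rows, "Outlier_height", True)
print([row["height"] for row in selected])
```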
Greetings,
Sebastian
Hi Sebastian,
I am tackling a similar problem. I tried the "Filter Examples" operator of version 5.0.
No matter how I set the parameter string (outlier=false / outlier=true), the result set is empty.
<operator activated="true" class="filter_examples" expanded="true" height="76" name="Filter Examples" width="90" x="313" y="435">
<parameter key="condition_class" value="attribute_value_filter"/>
<parameter key="parameter_string" value="outlier=false"/>
</operator>
Maybe some syntax problem?
Greetings,
Chris
G'Day Chris,
LOF produces a number rather than a boolean. Right-click on the operator, then F1 -> Description, which says:
"... Afterwards LOFs are added as values for a special real-valued outlier attribute in the example set which the operator will return."
G'day haddock,
thanks for the hint. It works with LOF (filtering for "outlier < 1").
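To illustrate the difference: when the outlier attribute holds a real-valued score instead of true/false, the filter condition has to be a numeric comparison. The sketch below is plain Python, not RapidMiner code, and the scores in it are made-up illustrative values, not actual LOF output.

```python
# Hypothetical real-valued outlier scores (illustrative values only).
scored_rows = [
    {"id": 1, "outlier": 0.95},
    {"id": 2, "outlier": 1.02},
    {"id": 3, "outlier": 0.98},
    {"id": 4, "outlier": 3.40},
]

# Numeric threshold, analogous to the condition "outlier < 1".
threshold = 1.0
kept = [row for row in scored_rows if row["outlier"] < threshold]
print([row["id"] for row in kept])
```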
So the question is: what is the correct syntax with boolean filter parameters?
Cheers,
Chris
I checked the example "processes->02_preprocessing_18_OutlierDetection".
It also uses a filter with "outlier=false", and it works.
So I rebuilt my workflow once more, and now it works. I cannot identify any differences...
Probably yet another case where the problem sits in front of the screen.
Thanks for your help anyway!