Hello everybody! I have the great honor of announcing the next
release of our Operator Toolbox extension. This release adds some
useful new functionality and builds on our existing capabilities. But
without further delay, let’s talk about the new things!
Detect Outliers (Univariate)
Often you want to figure out if a value is strange. RapidMiner already offers a lot of complex algorithms like LOF, CBLOF, rPCA and all the good stuff in the anomaly detection extension.
The easy methods like a traditional z-score were not yet
available in a single operator, so that’s what we did here! One operator to
check for odd things:

The out port contains a table with all the information you
need.

We added a column containing
an overall outlier score, which aggregates the scores of the individual
columns. Usually this is the average of the columns, but you can also get the
max or the product.
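To make the aggregation concrete, here is a minimal Python sketch of how per-column scores for one row could be combined. It is not the operator's actual code: the function name `aggregate_scores`, the method names, and the assumption that the column scores are non-negative magnitudes are all made up for illustration.

```python
import math

def aggregate_scores(column_scores, method="average"):
    """Combine the per-column outlier scores of one row into an overall score.

    Assumption: column_scores are non-negative magnitudes (e.g. absolute values).
    """
    if not column_scores:
        raise ValueError("need at least one column score")
    if method == "average":
        return sum(column_scores) / len(column_scores)
    if method == "max":
        return max(column_scores)
    if method == "product":
        return math.prod(column_scores)
    raise ValueError(f"unknown method: {method}")

# Hypothetical scores of one row across three columns
row_scores = [0.5, 2.0, 4.0]
for method in ("average", "max", "product"):
    print(method, aggregate_scores(row_scores, method))
```

The average smooths over columns, the max flags a row as soon as a single column looks odd, and the product rewards rows that are unusual in several columns at once.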
The vis port uses an old friend of yours: the explain
predictions object! You know it from the Explain Predictions operator or from
AutoModel. In this case we use it not to visualize the influence factors for
the score, but to visualize the outlierness of every single value:

As expected, an age of 0.9 is an outlier.
Lastly, the operator also gives you a preprocessing model
which allows you to apply the algorithms fitted on this data set to a different
data set using the Apply Model operator.
This operator currently supports three methods to calculate
the score:
- z-Score: How many standard deviations are you away from
the mean? I.e. score = (x - mean) / std_dev
- Quartiles: How many interquartile ranges are you away from
the median? I.e. score = (x - median) / iqr, where iqr is the difference
between the 25th and the 75th percentile. This is very similar to the Tukey Test
operator.
- Histogram: Use a histogram of the data. If the value falls into
an infrequent bin, then it is an outlier. This is similar to the HBOS
operator.
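The three methods above can be sketched in a few lines of plain Python. This is only an illustration, not the operator's implementation: the function names, the use of the sample standard deviation, the number of bins, and the choice of "1 minus the relative bin frequency" as the histogram score are all assumptions, and the age values are made up.

```python
import statistics
from collections import Counter

def z_scores(values):
    """score = (x - mean) / std_dev (sample standard deviation, an assumption)."""
    mean = statistics.mean(values)
    std_dev = statistics.stdev(values)
    return [(x - mean) / std_dev for x in values]

def quartile_scores(values):
    """score = (x - median) / iqr, with iqr = 75th minus 25th percentile."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    median = statistics.median(values)
    iqr = q3 - q1
    return [(x - median) / iqr for x in values]

def histogram_scores(values, bins=5):
    """Assumed scoring: 1 - relative frequency of the bin a value falls into."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # avoid a zero width for constant columns

    def bin_of(x):
        # The maximum value would land in bin `bins`, so clamp it to the last bin
        return min(int((x - lo) / width), bins - 1)

    counts = Counter(bin_of(x) for x in values)
    return [1.0 - counts[bin_of(x)] / len(values) for x in values]

# Made-up ages; the last value is the suspicious one
ages = [23, 25, 27, 29, 31, 33, 35, 0.9]
for name, scores in [
    ("z-score", z_scores(ages)),
    ("quartiles", quartile_scores(ages)),
    ("histogram", histogram_scores(ages)),
]:
    # Report the value with the largest absolute score per method
    worst = max(range(len(ages)), key=lambda i: abs(scores[i]))
    print(name, "flags", ages[worst])
```

Note that the z-score and quartile scores are signed, so the sketch compares absolute values. Columns with zero standard deviation or zero interquartile range are not handled here.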
Scan your Repository and your Processes
To manage your repository from processes we added the
ability to scan it! The List Repository Object operator can be pointed to any
directory and gives you all the objects in the folder structure.

This also allows you to execute every process in a folder by
combining this with Loop Values and Execute Process. There are multiple other
use cases if you combine this with loops.
The Scan Processes operator allows you to go deeper into
processes. It gives you a list of all operators used in the processes of a
folder.

One use case for this is to search for deprecated operators
or operators you do not want to use anymore. Another use case is of course to
analyze your own processes using machine learning!
Have you ever wondered who created an object? Or when? Or with what
commit-id on a RM Project? The Store (Tagged) operator gives you exactly this
option! If you store an object with Store (Tagged), you will get all of this
information as an annotation to the object.

This works for every object, not just for tables.
Read and Write SFTP support private keys now
The Read and Write SFTP operators have supported proxies and
username/password authentication since the last toolbox version. We went a step
further and added the ability to use private keys as well.