Stochastic Outlier Selection in Scala


An adapted version of the implementation for Apache Spark. This version aims to perform Stochastic Outlier Selection (SOS) using Scala only, i.e., without the need for any Apache Spark resources.

SOS is an unsupervised outlier selection algorithm. It uses the concept of affinity to compute an outlier probability for each data point.
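To make the step from affinity to outlier probability concrete, here is a minimal sketch based on the description in the technical report, not this library's code. It assumes an affinity matrix is already available (the perplexity-based variance search that produces it is omitted), and the names `SosSketch`, `bindingProbabilities`, and `outlierProbabilities` are hypothetical.

```scala
object SosSketch {
  // Row-normalise an affinity matrix into binding probabilities b(i)(j):
  // the probability that point i "binds" to point j.
  def bindingProbabilities(affinity: Array[Array[Double]]): Array[Array[Double]] =
    affinity.map { row =>
      val total = row.sum
      row.map(a => if (total == 0.0) 0.0 else a / total)
    }

  // Outlier probability of point j: the probability that no other point
  // binds to it, i.e. the product over i != j of (1 - b(i)(j)).
  def outlierProbabilities(binding: Array[Array[Double]]): Array[Double] = {
    val n = binding.length
    Array.tabulate(n) { j =>
      (0 until n).filter(_ != j).map(i => 1.0 - binding(i)(j)).product
    }
  }
}
```

A point that receives little binding probability from every other point ends up with an outlier probability close to 1.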

For more information about SOS, see the technical report: J.H.M. Janssens, F. Huszar, E.O. Postma, and H.J. van den Herik. Stochastic Outlier Selection. Technical Report TiCC TR 2012-001, Tilburg University, Tilburg, the Netherlands, 2012.

Selecting outliers from data

The current implementation accepts an Array whose elements are of type Array[Double] (one vector per data point) and returns the index of each vector together with its degree of outlierness.
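As an illustration of working with such index and outlierness pairs, the sketch below pairs each probability with the index of its input vector and keeps those above a threshold. It does not call this library: the object `OutlierRankingSketch`, its methods, and the `Array[(Int, Double)]` result shape are assumptions made for the example.

```scala
object OutlierRankingSketch {
  // Pair each outlier probability with the index of its input vector.
  def indexed(probabilities: Array[Double]): Array[(Int, Double)] =
    probabilities.zipWithIndex.map { case (p, i) => (i, p) }

  // Keep the points whose outlierness exceeds the threshold,
  // ordered from most to least outlying.
  def select(probabilities: Array[Double], threshold: Double): Array[(Int, Double)] =
    indexed(probabilities).filter(_._2 > threshold).sortBy(-_._2)
}
```

The threshold is a choice left to the caller; 0.5 is a natural starting point because the values are probabilities.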

The current implementation supports only the Euclidean distance.
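For reference, a pairwise Euclidean distance matrix over input of this shape can be sketched as follows. This is an illustrative sketch rather than the library's internal code, and the names `EuclideanSketch`, `distance`, and `distanceMatrix` are hypothetical.

```scala
object EuclideanSketch {
  // Euclidean distance between two vectors of equal dimension.
  def distance(a: Array[Double], b: Array[Double]): Double = {
    require(a.length == b.length, "vectors must have the same dimension")
    math.sqrt(a.zip(b).map { case (x, y) => (x - y) * (x - y) }.sum)
  }

  // Symmetric n x n matrix of pairwise distances with a zero diagonal.
  def distanceMatrix(data: Array[Array[Double]]): Array[Array[Double]] =
    Array.tabulate(data.length, data.length)((i, j) => distance(data(i), data(j)))
}
```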