Automatic optimization of outlier detection ensembles using a limited number of outlier examples

Niko Reunanen, Tomi Räty, Timo Lintonen*

*Corresponding author for this work

Research output: Contribution to journal › Article › Scientific › peer-review

14 Citations (Scopus)

Abstract

In data analysis, outliers are deviating and unexpected observations. Outlier detection is important because outliers can contain critical and interesting information. We propose an approach for optimizing outlier detection ensembles using a limited number of outlier examples. In our work, a limited number of outlier examples is defined as from 1 to 10% of the available outliers. The optimized outlier detection ensembles consist of outlier detection algorithms that provide an outlier score and utilize adjustable parameters. The automatic optimization determines the parameter values that enhance the discrimination of inliers and outliers, which increases the efficiency of the outlier detection. Outliers are rare by definition, which makes optimization with only a few examples beneficial. Because obtaining examples of outliers can be prohibitively challenging, the available outlier examples should be used efficiently.
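
The general idea in the abstract (score-producing detectors with adjustable parameters, tuned with a handful of known outliers) can be illustrated with a minimal sketch. This is not the authors' algorithm: the two distance-based detectors, the grid search, the AUC-like `separation` objective, the rank averaging, and all function and variable names are assumptions chosen only for illustration, and the data are synthetic.

```python
import math
import random

def kth_distance(points, k):
    """Outlier score: distance from each point to its k-th nearest neighbour."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores

def mean_distance(points, k):
    """Outlier score: mean distance from each point to its k nearest neighbours."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(sum(dists[:k]) / k)
    return scores

def to_ranks(scores):
    """Map raw scores to ranks in [0, 1] so different detectors are comparable."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    for r, i in enumerate(order):
        ranks[i] = r / (len(scores) - 1)
    return ranks

def separation(scores, known_outliers):
    """AUC-like objective: share of (known outlier, other point) pairs
    in which the known outlier receives the higher score."""
    known = set(known_outliers)
    others = [s for i, s in enumerate(scores) if i not in known]
    wins = sum((scores[i] > s) + 0.5 * (scores[i] == s)
               for i in known for s in others)
    return wins / (len(known) * len(others))

def optimize_ensemble(points, known_outliers, detectors, grid):
    """For each detector, keep the parameter value with the best separation."""
    return {
        name: max(grid, key=lambda k: separation(fn(points, k), known_outliers))
        for name, fn in detectors.items()
    }

def ensemble_score(points, detectors, chosen):
    """Average the rank-normalised scores of the tuned detectors."""
    ranked = [to_ranks(fn(points, chosen[name])) for name, fn in detectors.items()]
    return [sum(col) / len(col) for col in zip(*ranked)]

# Synthetic data (assumption): 60 Gaussian inliers and 5 distant outliers.
random.seed(0)
points = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(60)]
points += [(12, 12), (-13, 11), (14, -12), (-12, -13), (15, 0)]  # indices 60..64

detectors = {"kth_distance": kth_distance, "mean_distance": mean_distance}
grid = [1, 3, 5, 10]      # candidate values of the adjustable parameter k
known_outliers = [60]     # only one labelled outlier example is used

chosen = optimize_ensemble(points, known_outliers, detectors, grid)
scores = ensemble_score(points, detectors, chosen)
top5 = sorted(range(len(points)), key=lambda i: scores[i], reverse=True)[:5]
print(chosen, top5)
```

In this sketch a single labelled outlier drives the parameter choice, and the remaining four outliers are treated as unlabelled data; the tuned ensemble is then used to score every point.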
Original language: English
Pages (from-to): 377-394
Journal: International Journal of Data Science and Analytics
Volume: 10
Issue number: 4
Publication status: Published - 1 Oct 2020
MoE publication type: A1 Journal article-refereed

Funding

Open access funding provided by Technical Research Centre of Finland (VTT).

Keywords

  • Bagging
  • Outlier detection
  • Outlier detection ensemble
  • Semi-supervised outlier detection
