Probabilistic consensus clustering using evidence accumulation

André Lourenço, Samuel Rota Bulò (Corresponding Author), Nicola Rebagliati, Ana L.N. Fred, Mário A.T. Figueiredo, Marcello Pelillo

Research output: Contribution to journal › Article › Scientific › peer-review

26 Citations (Scopus)

Abstract

Clustering ensemble methods produce a consensus partition of a set of data points by combining the results of a collection of base clustering algorithms. In the evidence accumulation clustering (EAC) paradigm, the clustering ensemble is transformed into a pairwise co-association matrix, thus avoiding the label correspondence problem, which is intrinsic to other clustering ensemble schemes. In this paper, we propose a consensus clustering approach based on the EAC paradigm, which is not limited to crisp partitions and fully exploits the nature of the co-association matrix. Our solution determines probabilistic assignments of data points to clusters by minimizing a Bregman divergence between the observed co-association frequencies and the corresponding co-occurrence probabilities expressed as functions of the unknown assignments. We additionally propose an optimization algorithm to find a solution under any double-convex Bregman divergence. Experiments on both synthetic and real benchmark data show the effectiveness of the proposed approach.
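As an illustration of the quantities named in the abstract, the following minimal sketch builds a co-association matrix from a toy ensemble and evaluates one possible fit criterion. It rests on several assumptions that are not stated on this page: the co-occurrence probability of points i and j is modelled as the inner product of their assignment vectors, the squared Euclidean distance stands in for the Bregman divergence, and the sum runs over ordered off-diagonal pairs. The function names `co_association` and `squared_loss` are hypothetical, and the paper's optimization algorithm is not reproduced here.

```python
import numpy as np


def co_association(ensemble):
    """Fraction of base partitions in which each pair of points shares a cluster.

    ensemble: array-like of shape (n_partitions, n_points) with integer labels.
    Labels are only compared within a partition, so no label matching
    across partitions is needed.
    """
    labels = np.asarray(ensemble)
    # same[m, i, j] is True when points i and j share a label in partition m
    same = labels[:, :, None] == labels[:, None, :]
    return same.mean(axis=0)


def squared_loss(C, Y):
    """Squared Euclidean divergence between C and Y @ Y.T (assumed model).

    Y has shape (n_points, n_clusters); each row is a probability vector.
    The diagonal is excluded and both (i, j) and (j, i) are counted.
    """
    P = Y @ Y.T  # assumed co-occurrence probabilities
    mask = ~np.eye(C.shape[0], dtype=bool)
    return float(np.sum((C[mask] - P[mask]) ** 2))


# Toy ensemble: three base partitions of four points. The first two are the
# same partition with swapped label names, which the matrix treats identically.
ensemble = [
    [0, 0, 1, 1],
    [1, 1, 0, 0],
    [0, 0, 0, 1],
]
C = co_association(ensemble)

# A crisp candidate assignment written as one-hot probability rows.
Y = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
loss = squared_loss(C, Y)
print(C)
print(loss)
```

Rows of `Y` need not be one-hot; any rows on the probability simplex can be scored the same way, which is what allows assignments that are not crisp.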
Original language: English
Pages (from-to): 331-357
Journal: Machine Learning
Volume: 98
Issue number: 1-2
DOIs: https://doi.org/10.1007/s10994-013-5339-6
Publication status: Published - 2013
MoE publication type: A1 Journal article-refereed

Fingerprint

Clustering algorithms
Labels
Experiments

Keywords

  • Bregman divergence
  • consensus clustering
  • ensemble clustering
  • evidence accumulation

Cite this

Lourenço, A., Rota Bulò, S., Rebagliati, N., Fred, A. L. N., Figueiredo, M. A. T., & Pelillo, M. (2013). Probabilistic consensus clustering using evidence accumulation. Machine Learning, 98(1-2), 331-357. https://doi.org/10.1007/s10994-013-5339-6
Lourenço, André ; Rota Bulò, Samuel ; Rebagliati, Nicola ; Fred, Ana L.N. ; Figueiredo, Mário A.T. ; Pelillo, Marcello. / Probabilistic consensus clustering using evidence accumulation. In: Machine Learning. 2013 ; Vol. 98, No. 1-2. pp. 331-357.
@article{27cdfc5053e1403a80438e0b072470a0,
title = "Probabilistic consensus clustering using evidence accumulation",
abstract = "Clustering ensemble methods produce a consensus partition of a set of data points by combining the results of a collection of base clustering algorithms. In the evidence accumulation clustering (EAC) paradigm, the clustering ensemble is transformed into a pairwise co-association matrix, thus avoiding the label correspondence problem, which is intrinsic to other clustering ensemble schemes. In this paper, we propose a consensus clustering approach based on the EAC paradigm, which is not limited to crisp partitions and fully exploits the nature of the co-association matrix. Our solution determines probabilistic assignments of data points to clusters by minimizing a Bregman divergence between the observed co-association frequencies and the corresponding co-occurrence probabilities expressed as functions of the unknown assignments. We additionally propose an optimization algorithm to find a solution under any double-convex Bregman divergence. Experiments on both synthetic and real benchmark data show the effectiveness of the proposed approach.",
keywords = "Bregman divergence, consensus clustering, ensemble clustering, evidence accumulation",
author = "Andr{\'e} Louren{\cc}o and {Rota Bul{\`o}}, Samuel and Nicola Rebagliati and Fred, {Ana L.N.} and Figueiredo, {M{\'a}rio A.T.} and Marcello Pelillo",
year = "2013",
doi = "10.1007/s10994-013-5339-6",
language = "English",
volume = "98",
pages = "331--357",
journal = "Machine Learning",
issn = "0885-6125",
publisher = "Springer",
number = "1-2",

}

Lourenço, A, Rota Bulò, S, Rebagliati, N, Fred, ALN, Figueiredo, MAT & Pelillo, M 2013, 'Probabilistic consensus clustering using evidence accumulation', Machine Learning, vol. 98, no. 1-2, pp. 331-357. https://doi.org/10.1007/s10994-013-5339-6

Probabilistic consensus clustering using evidence accumulation. / Lourenço, André; Rota Bulò, Samuel (Corresponding Author); Rebagliati, Nicola; Fred, Ana L.N.; Figueiredo, Mário A.T.; Pelillo, Marcello.

In: Machine Learning, Vol. 98, No. 1-2, 2013, p. 331-357.

TY - JOUR
T1 - Probabilistic consensus clustering using evidence accumulation
AU - Lourenço, André
AU - Rota Bulò, Samuel
AU - Rebagliati, Nicola
AU - Fred, Ana L.N.
AU - Figueiredo, Mário A.T.
AU - Pelillo, Marcello
PY - 2013
Y1 - 2013

AB - Clustering ensemble methods produce a consensus partition of a set of data points by combining the results of a collection of base clustering algorithms. In the evidence accumulation clustering (EAC) paradigm, the clustering ensemble is transformed into a pairwise co-association matrix, thus avoiding the label correspondence problem, which is intrinsic to other clustering ensemble schemes. In this paper, we propose a consensus clustering approach based on the EAC paradigm, which is not limited to crisp partitions and fully exploits the nature of the co-association matrix. Our solution determines probabilistic assignments of data points to clusters by minimizing a Bregman divergence between the observed co-association frequencies and the corresponding co-occurrence probabilities expressed as functions of the unknown assignments. We additionally propose an optimization algorithm to find a solution under any double-convex Bregman divergence. Experiments on both synthetic and real benchmark data show the effectiveness of the proposed approach.

KW - Bregman divergence
KW - consensus clustering
KW - ensemble clustering
KW - evidence accumulation
U2 - 10.1007/s10994-013-5339-6
DO - 10.1007/s10994-013-5339-6
M3 - Article
VL - 98
SP - 331
EP - 357
JO - Machine Learning
JF - Machine Learning
SN - 0885-6125
IS - 1-2
ER -

Lourenço A, Rota Bulò S, Rebagliati N, Fred ALN, Figueiredo MAT, Pelillo M. Probabilistic consensus clustering using evidence accumulation. Machine Learning. 2013;98(1-2):331-357. https://doi.org/10.1007/s10994-013-5339-6