Matching samples of multiple views

Abhishek Tripathi (Corresponding Author), Arto Klami, Matej Orešič, Samuel Kaski

Research output: Contribution to journal › Article › Scientific › peer-review

10 Citations (Scopus)

Abstract

Multi-view learning studies how several views, that is, different feature representations of the same objects, can best be utilized in learning. In other words, multi-view learning is the analysis of co-occurrence data, where the observations are co-occurrences of samples in the views. Standard multi-view learning, such as joint density modeling, cannot be done in the absence of co-occurrence, that is, when the views are observed separately and the identities of the objects are not known. As a practical example, joint analysis of mRNA and protein concentrations requires a mapping between genes and proteins. We introduce a data-driven approach for learning the correspondence between the observations in the different views, in order to enable joint analysis even in the absence of known co-occurrence. The method finds a matching that maximizes the statistical dependency between the views, which makes it particularly suitable for multi-view methods such as canonical correlation analysis, which has the same objective. We apply the method to translational metabolomics to identify differences and commonalities in metabolic processes in different species or tissues. The metabolite identities and roles in the different species are not generally known, so it is necessary to search for a matching. Using different metabolomics measurement batches as the views, so that the ground truth is known, we show that the metabolite identities can be reliably matched by a consensus of several matching solutions.
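
The abstract does not spell out the algorithm, so the following is only a toy illustration of its two central ideas: choosing the matching that maximizes a dependency measure between two views, and keeping only the pairs on which several candidate matchings agree. Everything in it is an assumption made for illustration: the function names `pearson`, `best_matching`, and `consensus_matching` are hypothetical, each view is reduced to a single feature, signed Pearson correlation stands in for the dependency measure, and a brute-force search over permutations stands in for a real bipartite-matching solver. It is not the authors' implementation.

```python
from collections import Counter
from itertools import permutations
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / sqrt(var_a * var_b)


def best_matching(x, y):
    """Return the permutation of y with the highest correlation to x.

    perm[i] = j means sample i of view x is matched to sample j of view y.
    Brute force: only feasible for a handful of samples (assumption: toy data).
    """
    best_perm, best_score = None, float("-inf")
    for perm in permutations(range(len(y))):
        score = pearson(x, [y[j] for j in perm])
        if score > best_score:
            best_perm, best_score = perm, score
    return list(best_perm), best_score


def consensus_matching(matchings, min_agreement=0.5):
    """Keep pair (i, j) only if more than `min_agreement` of the matchings propose it.

    The majority threshold is an assumed rule, not one taken from the paper.
    """
    votes = Counter((i, j) for m in matchings for i, j in enumerate(m))
    needed = min_agreement * len(matchings)
    return {i: j for (i, j), count in votes.items() if count > needed}


# Two views of the same four objects; the second is noisy and shuffled.
x = [1.0, 2.0, 3.0, 4.0]
y = [3.2, 1.1, 3.9, 1.9]
perm, score = best_matching(x, y)

# Three hypothetical candidate matchings that only partly agree.
candidates = [[1, 3, 0, 2], [1, 3, 2, 0], [1, 0, 3, 2]]
consensus = consensus_matching(candidates)
```

On this toy data the search recovers the ordering that undoes the shuffle, and the consensus step leaves sample 2 unmatched because no single partner receives a majority of votes. With realistic sample counts the permutation search would have to be replaced by an assignment solver such as the Hungarian algorithm, and the single-feature correlation by a multivariate dependency measure such as the canonical correlation named in the abstract.
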
Original language: English
Pages (from-to): 300-321
Journal: Data Mining and Knowledge Discovery
Volume: 23
Issue number: 2
DOI: https://doi.org/10.1007/s10618-010-0205-7
Publication status: Published - 2011
MoE publication type: A1 Journal article-refereed

Fingerprint

Metabolites
Proteins
Genes
Tissue
Metabolomics
Messenger RNA

Keywords

  • Bipartite matching
  • Canonical correlation
  • Consensus matching
  • Co-occurrence data
  • Multi-view learning

Cite this

Tripathi, A., Klami, A., Orešič, M., & Kaski, S. (2011). Matching samples of multiple views. Data Mining and Knowledge Discovery, 23(2), 300-321. https://doi.org/10.1007/s10618-010-0205-7