TY - JOUR
T1 - Data integration and visualization system for enabling conceptual biology
AU - Gopalacharyulu, Peddinti V.
AU - Lindfors, Erno
AU - Bounsaythip, Catherine
AU - Kivioja, Teemu
AU - Yetukuri, Laxman
AU - Hollmén, Jaakko
AU - Orešič, Matej
PY - 2005
Y1 - 2005
N2 - Motivation: Integration of heterogeneous data in
life sciences is a growing and recognized challenge. The problem is not
only to enable the study of such data within the context of a biological
question but also, more fundamentally, to represent the available
knowledge and make it accessible for mining. Results:
Our integration approach is based on the premise that relationships
between biological entities can be represented as a complex network. The
context dependency is achieved by a judicious use of distance measures
on these networks. The biological entities and the distances between
them are mapped for the purpose of visualization into a
lower-dimensional space using Sammon's mapping. The system implementation
is based on a multi-tier architecture using a native XML database and a
software tool for querying and visualizing complex biological networks.
The functionality of our system is demonstrated with two examples: (1) A
multiple pathway retrieval, in which, given a pathway name, the system
finds all the relationships related to the query by checking available
metabolic pathway, transcriptional, signaling, protein–protein
interaction and ontology annotation resources and (2) A protein
neighborhood search, in which, given a protein name, the system finds all
its connected entities within a specified depth. These two examples
show that our system is able to conceptually traverse different
databases to produce testable hypotheses and lead towards answers to
complex biological questions.
AB - Motivation: Integration of heterogeneous data in
life sciences is a growing and recognized challenge. The problem is not
only to enable the study of such data within the context of a biological
question but also, more fundamentally, to represent the available
knowledge and make it accessible for mining. Results:
Our integration approach is based on the premise that relationships
between biological entities can be represented as a complex network. The
context dependency is achieved by a judicious use of distance measures
on these networks. The biological entities and the distances between
them are mapped for the purpose of visualization into a
lower-dimensional space using Sammon's mapping. The system implementation
is based on a multi-tier architecture using a native XML database and a
software tool for querying and visualizing complex biological networks.
The functionality of our system is demonstrated with two examples: (1) A
multiple pathway retrieval, in which, given a pathway name, the system
finds all the relationships related to the query by checking available
metabolic pathway, transcriptional, signaling, protein–protein
interaction and ontology annotation resources and (2) A protein
neighborhood search, in which, given a protein name, the system finds all
its connected entities within a specified depth. These two examples
show that our system is able to conceptually traverse different
databases to produce testable hypotheses and lead towards answers to
complex biological questions.
KW - xml
UR - http://www.scopus.com/inward/record.url?eid=2-s2.0-29144514388&partnerID=MN8TOARS
U2 - 10.1093/bioinformatics/bti1015
DO - 10.1093/bioinformatics/bti1015
M3 - Article
SN - 1367-4803
VL - 21
SP - i177-i185
JO - Bioinformatics
JF - Bioinformatics
IS - Suppl. 1
ER -