Data integration, pathway analysis and mining for systems biology: Dissertation

Research output: Thesis, Dissertation, Collection of Articles


Post-genomic molecular biology embodies high-throughput experimental techniques and hence is a data-rich field. The goal of this thesis is to develop bioinformatics methods to utilise publicly available data in order to produce knowledge and to aid mining of newly generated data. As an example of knowledge or hypothesis generation, consider function prediction of biological molecules. Assignment of protein function is a non-trivial task owing to the fact that the same protein may be involved in different biological processes, depending on the state of the biological system and protein localisation. The function of a gene or a gene product may be provided as a textual description in a gene or protein annotation database. Such textual descriptions do not capture the contextual meaning of the gene function. Therefore, we need ways to represent the meaning in a formal way. Here we apply a data integration approach to provide a rich representation that enables context-sensitive mining of biological data in terms of integrated networks and conceptual spaces. Context-sensitive gene function annotation follows naturally from this framework, as a particular application. Next, knowledge that is already publicly available can be used to aid mining of new experimental data. We developed an integrative bioinformatics method that utilises publicly available knowledge of protein-protein interactions, metabolic networks and transcriptional regulatory networks to analyse transcriptomics data and predict altered biological processes. We applied this method to a study of the dynamic response of Saccharomyces cerevisiae to oxidative stress. The application of our method revealed dynamically altered biological functions in response to oxidative stress, which were validated by comprehensive in vivo metabolomics experiments. The results provided in this thesis indicate that integration of heterogeneous biological data facilitates advanced mining of the data. The methods can be applied for gaining insight into functions of genes, gene products and other molecules, as well as for offering functional interpretation to transcriptomics and metabolomics experiments.
Original language: English
Qualification: Doctor Degree
Awarding Institution
  • Aalto University
  • Kaski, Kimmo, Supervisor, External person
Award date: 14 May 2010
Place of Publication: Espoo
Print ISBNs: 978-951-38-7385-1
Electronic ISBNs: 978-951-38-7386-8
Publication status: Published - 2010
MoE publication type: G5 Doctoral dissertation (article)


Keywords
  • systems biology
  • high-throughput data
  • data integration
  • data mining
  • visualisation
  • bioinformatics
  • conceptual spaces
  • network topology

