Integrating data from multiple Finnish biobanks and national health-care registers for retrospective studies: Practical experiences

Jaakko Lähteenmäki (Corresponding Author), Anna-Leena Vuorinen, Juha Pajula, Kari Harno, Mika Lehto, Mikko Niemi, Mark van Gils

    Research output: Contribution to journal › Article › Scientific › peer-review

    8 Citations (Scopus)


    Aim: This case study aimed to investigate the process of integrating resources of multiple biobanks and health-care registers, especially addressing data permit application, time schedules, co-operation of stakeholders, data exchange and data quality.
    Methods: We investigated the process in the context of a retrospective study: Pharmacogenomics of antithrombotic drugs (PreMed study). The study involved linking the genotype data of three Finnish biobanks (Auria Biobank, Helsinki Biobank and THL Biobank) with register data on medicine dispensations, health-care encounters and laboratory results.
    Results: We managed to collect a cohort of 7005 genotyped individuals, thereby achieving the statistical power requirements of the study. The data collection process took 16 months, exceeding our original estimate by seven months. The main delays were caused by the congested data permit approval service to access national register data on health-care encounters. Comparison of hospital data lakes and national registers revealed differences, especially concerning medication data. Genetic variant frequencies were in line with earlier data reported for the European population. The yearly number of international normalised ratio (INR) tests showed stable behaviour over time.
    Conclusions: A large cohort, consisting of versatile individual-level phenotype and genotype data, can be constructed by integrating data from several biobanks and health data registers in Finland. Co-operation with biobanks is straightforward. However, long time periods need to be reserved when biobank resources are linked with national register data. There is a need for efforts to define general, harmonised co-operation practices and data exchange methods for enabling efficient collection of data from multiple sources.
    Original language: English
    Pages (from-to): 482-489
    Number of pages: 8
    Journal: Scandinavian Journal of Public Health
    Issue number: 4
    Publication status: Published - Jun 2022
    MoE publication type: A1 Journal article-refereed


    • biobank
    • health-care data
    • register data
    • real world data
    • genotype
    • secondary use
    • pharmacogenetics
    • precision medicine

