TY - JOUR
T1 - Integrating data from multiple Finnish biobanks and national health-care registers for retrospective studies
T2 - Practical experiences
AU - Lähteenmäki, Jaakko
AU - Vuorinen, Anna-Leena
AU - Pajula, Juha
AU - Harno, Kari
AU - Lehto, Mika
AU - Niemi, Mikko
AU - van Gils, Mark
N1 - Funding Information:
The authors disclosed receipt of the following financial support for the research, authorship and/or publication of this article: This study was funded by Business Finland, VTT Technical Research Centre of Finland Ltd, Karl Fazer AB, Novartis Finland Oy, Pfizer Oy, Roche Diagnostics Oy, Avaintec Oy, Crown CRO Oy, Mediconsult Oy and Biobank Cooperative Finland.
Funding Information:
The study used data obtained from Auria Biobank (study number: AB19-9833), Helsinki Biobank (study number: HBP20190038) and THL Biobank (study number: BB2019_6). We thank all study participants for their generous participation in the biobank research. We also thank Merja Perälä, Perttu Terho and Mikko Tukiainen of Auria Biobank; Theresa Knopp, Miika Koskinen and Otto Manninen from Helsinki Biobank; and Niina Eklund, Anni Joensuu and Katariina Peltonen from THL Biobank for their contribution in defining the sub-cohorts for the study. We also thank the contribution of Medaffcon Oy researchers, Tanja Nieminen and Maija Wolf, for support in defining the data needs for health-care resource usage evaluation.
Publisher Copyright:
© Author(s) 2021.
PY - 2022/6
Y1 - 2022/6
N2 - Aim: This case study aimed to investigate the process of integrating resources of multiple biobanks and health-care registers, especially addressing data permit application, time schedules, co-operation of stakeholders, data exchange and data quality. Methods: We investigated the process in the context of a retrospective study: Pharmacogenomics of antithrombotic drugs (PreMed study). The study involved linking the genotype data of three Finnish biobanks (Auria Biobank, Helsinki Biobank and THL Biobank) with register data on medicine dispensations, health-care encounters and laboratory results. Results: We managed to collect a cohort of 7005 genotyped individuals, thereby achieving the statistical power requirements of the study. The data collection process took 16 months, exceeding our original estimate by seven months. The main delays were caused by the congested data permit approval service to access national register data on health-care encounters. Comparison of hospital data lakes and national registers revealed differences, especially concerning medication data. Genetic variant frequencies were in line with earlier data reported for the European population. The yearly number of international normalised ratio (INR) tests showed stable behaviour over time. Conclusions: A large cohort, consisting of versatile individual-level phenotype and genotype data, can be constructed by integrating data from several biobanks and health data registers in Finland. Co-operation with biobanks is straightforward. However, long time periods need to be reserved when biobank resources are linked with national register data. There is a need for efforts to define general, harmonised co-operation practices and data exchange methods for enabling efficient collection of data from multiple sources.
KW - biobank
KW - health-care data
KW - register data
KW - real world data
KW - genotype
KW - secondary use
KW - pharmacogenetics
KW - precision medicine
UR - http://www.scopus.com/inward/record.url?scp=85104306178&partnerID=8YFLogxK
U2 - 10.1177/14034948211004421
DO - 10.1177/14034948211004421
M3 - Article
C2 - 33845693
SN - 1403-4948
VL - 50
SP - 482
EP - 489
JO - Scandinavian Journal of Public Health
JF - Scandinavian Journal of Public Health
IS - 4
ER -