Dutta Diptavo, Chatterjee Nilanjan
Integrative Tumor Epidemiology Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, 9609 Medical Center Drive, Rockville, MD, 20879, United States.
Department of Biostatistics, Johns Hopkins University, 615 N Wolfe Street, Baltimore, MD, 21205, United States.
Hum Mol Genet. 2025 May 2. doi: 10.1093/hmg/ddaf054.
Biobanks have become pivotal in genetic research, particularly through genome-wide association studies (GWAS). By integrating genetic data with phenotypic, environmental, family history, and behavioral information, they have driven transformative insights into the genetic basis of complex diseases and traits. This review explores the distinct designs and utility of different biobanks, highlighting their unique contributions to genetic research. We further discuss the utility of, and methodological advances in, combining data from disease-specific studies or consortia with biobank data, focusing especially on summary-statistics-based meta-analysis. Subsequently, we review the additional advantages that biobanks offer genetic studies: representing population differences, calibrating polygenic scores, assessing pleiotropy, and improving post-GWAS in silico analyses. Advances in sequencing technologies, particularly whole-exome and whole-genome sequencing, have further enabled the discovery of rare variants at biobank scale. Among recent developments, the integration of large-scale multi-omics data, especially proteomics and metabolomics, within biobanks provides deeper insights into disease mechanisms and regulatory pathways. Despite challenges such as those arising from ascertainment strategies and phenotypic misclassification, biobanks continue to evolve, driving methodological innovation and enabling precision medicine. We highlight the contributions of biobanks to genetic research and their growing integration with multi-omics, and finally discuss their future potential for advancing healthcare and therapeutic development.