An atlas of genetic scores to predict multi-omic traits.
Affiliations
Cambridge Baker Systems Genomics Initiative, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK.
British Heart Foundation Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK.
Publication information
Nature. 2023 Apr;616(7955):123-131. doi: 10.1038/s41586-023-05844-9. Epub 2023 Mar 29.
The use of omic modalities to dissect the molecular underpinnings of common diseases and traits is becoming increasingly common. But multi-omic traits can be genetically predicted, which enables highly cost-effective and powerful analyses for studies that do not have multi-omics. Here we examine a large cohort (the INTERVAL study; n = 50,000 participants) with extensive multi-omic data for plasma proteomics (SomaScan, n = 3,175; Olink, n = 4,822), plasma metabolomics (Metabolon HD4, n = 8,153), serum metabolomics (Nightingale, n = 37,359) and whole-blood Illumina RNA sequencing (n = 4,136), and use machine learning to train genetic scores for 17,227 molecular traits, including 10,521 that reach Bonferroni-adjusted significance. We evaluate the performance of genetic scores through external validation across cohorts of individuals of European, Asian and African American ancestries. In addition, we show the utility of these multi-omic genetic scores by quantifying the genetic control of biological pathways and by generating a synthetic multi-omic dataset of the UK Biobank to identify disease associations using a phenome-wide scan. We highlight a series of biological insights with regard to genetic mechanisms in metabolism and canonical pathway associations with disease; for example, JAK-STAT signalling and coronary atherosclerosis. Finally, we develop a portal (https://www.omicspred.org/) to facilitate public access to all genetic scores and validation results, as well as to serve as a platform for future extensions and enhancements of multi-omic genetic scores.
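As a rough illustration of what applying a genetic score and a Bonferroni adjustment can look like, the sketch below assumes a linear score (a weighted sum of allele dosages), which is a common form for genetic scores but is not specified in the abstract. The variant weights, the dosages, the 0.05 significance level and the choice of 17,227 as the number of tests are all hypothetical values chosen for illustration, not figures or methods confirmed by the study.

```python
def genetic_score(dosages, weights):
    """Return the weighted sum of allele dosages for one individual.

    Assumes a linear score; dosages are effect-allele counts (0 to 2)
    and weights are per-variant effect sizes. Both are hypothetical here.
    """
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


def bonferroni_threshold(alpha, n_tests):
    """Return the per-test significance threshold under Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# Hypothetical weights for three variants and one individual's dosages.
weights = [0.12, -0.05, 0.30]
dosages = [2, 1, 0]
score = genetic_score(dosages, weights)

# Assumption: one test per molecular trait (17,227 traits) at alpha = 0.05.
threshold = bonferroni_threshold(0.05, 17227)

print(f"genetic score: {score:.2f}")
print(f"per-test threshold: {threshold:.3e}")
```

A trait's score would count as significant in this sketch only if its association p-value fell below `threshold`; how the study actually defined its tests and significance level should be taken from the paper itself.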