Efficient Estimation and Applications of Cross-Validated Genetic Predictions to Polygenic Risk Scores and Linear Mixed Models.

Affiliations

Neurology, UCLA, Los Angeles, California.

School of Medicine, UCSF, San Francisco, California.

Publication information

J Comput Biol. 2020 Apr;27(4):599-612. doi: 10.1089/cmb.2019.0325. Epub 2020 Feb 20.

Abstract

Large-scale cohorts with combined genetic and phenotypic data, coupled with methodological advances, have produced increasingly accurate genetic predictors of complex human phenotypes, called polygenic risk scores (PRSs). Beyond their potential translational impact in identifying at-risk individuals, PRSs are being used for a growing list of scientific applications, including causal inference, identification of pleiotropy and genetic correlation, and powerful gene-based and mixed-model association tests. Existing PRS approaches rely on external large-scale genetic cohorts that have also measured the phenotype of interest, and they further require matching on ancestry and on genotyping platform or imputation quality. In this work, we present a novel reference-free method that produces a PRS without relying on an external cohort. We show that naive implementations of reference-free PRSs result in either substantial overfitting or prohibitive increases in computational time. Our algorithm avoids both issues and can produce informative in-sample PRSs over a single cohort without overfitting. We then demonstrate several novel applications of reference-free PRSs, including detection of pleiotropy across 246 metabolic traits and efficient mixed-model association testing.
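To make the overfitting problem concrete, here is a minimal sketch of the naive approach the abstract contrasts itself with. It is not the authors' algorithm: ridge regression stands in for the genetic predictor, the data are simulated with no genetic signal at all, and the function names, fold count, and penalty are assumptions chosen for illustration. The in-sample score is fitted and evaluated on the same individuals, while the cross-validated score for each individual comes from a model that never saw that individual's fold. Refitting once per fold is the simple but costly route; the abstract's efficiency claim concerns avoiding that cost, which this sketch does not attempt.

```python
import numpy as np


def fit_ridge(G_train, y_train, ridge):
    """Fit a ridge predictor; returns (SNP means, phenotype mean, effect sizes)."""
    mu_g = G_train.mean(axis=0)
    mu_y = y_train.mean()
    X = G_train - mu_g
    m = X.shape[1]
    beta = np.linalg.solve(X.T @ X + ridge * np.eye(m), X.T @ (y_train - mu_y))
    return mu_g, mu_y, beta


def in_sample_scores(G, y, ridge=1.0):
    """Naive score: fit and predict on the same individuals (overfits)."""
    mu_g, mu_y, beta = fit_ridge(G, y, ridge)
    return (G - mu_g) @ beta + mu_y


def cv_scores_kfold(G, y, n_folds=5, ridge=1.0, seed=0):
    """Out-of-fold score: each fold is predicted by a model fitted without it."""
    n = G.shape[0]
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), n_folds)
    scores = np.empty(n)
    for test_idx in folds:
        train_mask = np.ones(n, dtype=bool)
        train_mask[test_idx] = False
        # Centering and effect sizes use training individuals only.
        mu_g, mu_y, beta = fit_ridge(G[train_mask], y[train_mask], ridge)
        scores[test_idx] = (G[test_idx] - mu_g) @ beta + mu_y
    return scores


# Simulated cohort (hypothetical sizes): 200 individuals, 500 SNPs,
# and a phenotype that is pure noise, so any apparent fit is overfitting.
rng = np.random.default_rng(42)
G = rng.binomial(2, 0.3, size=(200, 500)).astype(float)
y = rng.normal(size=200)

naive = in_sample_scores(G, y)
cv_scores = cv_scores_kfold(G, y)

r_in = np.corrcoef(naive, y)[0, 1]
r_cv = np.corrcoef(cv_scores, y)[0, 1]
print(f"in-sample correlation with phenotype:       {r_in:.3f}")
print(f"cross-validated correlation with phenotype: {r_cv:.3f}")
```

Because the simulated phenotype carries no genetic signal, a high in-sample correlation reflects overfitting alone, whereas the out-of-fold correlation should stay close to zero.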
