Polygenic scores via penalized regression on summary statistics.

Author information

Mak Timothy Shin Heng, Porsch Robert Milan, Choi Shing Wan, Zhou Xueya, Sham Pak Chung

Affiliations

Centre for Genomic Sciences, University of Hong Kong, Hong Kong.

Department of Psychiatry, University of Hong Kong, Hong Kong.

Publication information

Genet Epidemiol. 2017 Sep;41(6):469-480. doi: 10.1002/gepi.22050. Epub 2017 May 8.

Abstract

Polygenic scores (PGS) summarize the genetic contribution of a person's genotype to a disease or phenotype. They can be used to group participants into different risk categories for diseases, and are also used as covariates in epidemiological analyses. A number of ways of calculating PGS have been proposed, and there has recently been much interest in methods that incorporate information available in published summary statistics. As there is no inherent information on linkage disequilibrium (LD) in summary statistics, a pertinent question is how we can use LD information available elsewhere to supplement such analyses. To answer this question, we propose a method for constructing PGS using summary statistics and a reference panel in a penalized regression framework, which we call lassosum. We also propose a general method, pseudovalidation, for choosing the value of the tuning parameter in the absence of validation data. In our simulations, we showed that pseudovalidation often resulted in prediction accuracy comparable to that obtained using a dataset with validation phenotype, and that it was clearly superior to the conservative option of setting the tuning parameter of lassosum to its lowest value. We also showed that lassosum achieved better prediction accuracy than simple clumping and P-value thresholding in almost all scenarios. It was also substantially faster and more accurate than the recently proposed LDpred.
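
The abstract does not state the objective function, so the following is only a minimal sketch of the general idea of a lasso fitted from summary statistics plus a reference panel, not the lassosum implementation. It assumes an objective of the form beta' R beta - 2 beta' r + 2 lam ||beta||_1, where r holds SNP-phenotype correlations from summary statistics and R is an LD (correlation) matrix estimated from reference genotypes and shrunk toward the identity. All names (`lasso_from_summary`, `X_ref`, `s`, `lam`) and all numbers are hypothetical and chosen for illustration.

```python
import numpy as np


def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become 0."""
    return np.sign(z) * max(abs(z) - lam, 0.0)


def lasso_from_summary(r, R, lam, n_iter=200, tol=1e-8):
    """Coordinate descent for: beta' R beta - 2 beta' r + 2 lam ||beta||_1.

    r   : SNP-phenotype correlations (assumed derived from summary statistics)
    R   : LD matrix (assumed estimated from a reference panel)
    lam : lasso tuning parameter (hypothetical name)
    """
    beta = np.zeros(len(r))
    for _ in range(n_iter):
        max_change = 0.0
        for j in range(len(r)):
            # Correlation left for SNP j after removing the other SNPs' effects
            z = r[j] - R[j] @ beta + R[j, j] * beta[j]
            new = soft_threshold(z, lam) / R[j, j]
            max_change = max(max_change, abs(new - beta[j]))
            beta[j] = new
        if max_change < tol:
            break
    return beta


# Simulated stand-in for a reference panel: 200 people, 5 SNPs coded 0/1/2
rng = np.random.default_rng(0)
X_ref = rng.binomial(2, 0.3, size=(200, 5)).astype(float)

# LD estimated from the panel, shrunk toward the identity (s is an assumption)
s = 0.5
R_shrunk = (1 - s) * np.corrcoef(X_ref, rowvar=False) + s * np.eye(5)

# Made-up SNP-phenotype correlations standing in for published summary statistics
r = np.array([0.10, 0.08, 0.00, -0.05, 0.01])

beta = lasso_from_summary(r, R_shrunk, lam=0.02)

# A polygenic score is the weighted sum of a person's genotypes
genotypes = rng.binomial(2, 0.3, size=(3, 5)).astype(float)
pgs = genotypes @ beta
```

In this sketch, raising `lam` sets more weights exactly to zero, which is why the choice of tuning parameter, with or without validation data, matters for prediction accuracy.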
