Polygenic scores via penalized regression on summary statistics.

Authors

Mak Timothy Shin Heng, Porsch Robert Milan, Choi Shing Wan, Zhou Xueya, Sham Pak Chung

Affiliations

Centre for Genomic Sciences, University of Hong Kong, Hong Kong.

Department of Psychiatry, University of Hong Kong, Hong Kong.

Publication information

Genet Epidemiol. 2017 Sep;41(6):469-480. doi: 10.1002/gepi.22050. Epub 2017 May 8.

DOI: 10.1002/gepi.22050
PMID: 28480976

Abstract

Polygenic scores (PGS) summarize the genetic contribution of a person's genotype to a disease or phenotype. They can be used to group participants into different risk categories for diseases, and are also used as covariates in epidemiological analyses. A number of ways of calculating PGS have been proposed, and recently there has been much interest in methods that incorporate information available in published summary statistics. As summary statistics carry no inherent information on linkage disequilibrium (LD), a pertinent question is how LD information available elsewhere can be used to supplement such analyses. To answer this question, we propose a method for constructing PGS using summary statistics and a reference panel in a penalized regression framework, which we call lassosum. We also propose a general method, pseudovalidation, for choosing the value of the tuning parameter in the absence of validation data. In our simulations, pseudovalidation often resulted in prediction accuracy comparable to that obtained using a dataset with validation phenotypes, and was clearly superior to the conservative option of setting the tuning parameter of lassosum to its lowest value. lassosum also achieved better prediction accuracy than simple clumping and P-value thresholding in almost all scenarios, and was substantially faster and more accurate than the recently proposed LDpred.
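
To make the idea of penalized regression on summary statistics concrete, the following is a minimal Python sketch and not the authors' implementation. It assumes an elastic-net-style objective, (1 - s) β'Rβ + s β'β - 2 β'r + 2λ‖β‖₁, where r is the vector of SNP-phenotype correlations taken from summary statistics, R is the LD (correlation) matrix estimated from a reference panel, λ is the tuning parameter, and s is a shrinkage parameter. The exact objective should be checked against the paper. The function name `lassosum_sketch`, its arguments, and the default values are hypothetical and do not correspond to the API of the lassosum software. The sketch also assumes that no reference SNP is monomorphic, so every column can be standardized.

```python
import numpy as np


def soft_threshold(z, lam):
    """Soft-thresholding operator used by lasso coordinate descent."""
    return np.sign(z) * max(abs(z) - lam, 0.0)


def lassosum_sketch(r, X_ref, lam, s=0.5, n_iter=200, tol=1e-8):
    """Coordinate descent for an assumed summary-statistic lasso objective.

    r      : SNP-phenotype correlations from summary statistics (length p)
    X_ref  : reference-panel genotypes, shape (n_ref, p); supplies LD only
    lam    : L1 tuning parameter (lambda)
    s      : shrinkage of the LD matrix toward the identity, in [0, 1]
    All names and defaults here are illustrative assumptions.
    """
    r = np.asarray(r, dtype=float)
    X_ref = np.asarray(X_ref, dtype=float)
    n, p = X_ref.shape

    # Standardize columns so that X'X / n is a correlation (LD) matrix.
    # Assumes every column has non-zero variance.
    X = (X_ref - X_ref.mean(axis=0)) / X_ref.std(axis=0)
    diag = (X ** 2).sum(axis=0) / n  # diagonal of the LD matrix (all ones)

    beta = np.zeros(p)
    fitted = np.zeros(n)  # running value of X @ beta

    for _ in range(n_iter):
        max_change = 0.0
        for j in range(p):
            xj = X[:, j]
            # LD-weighted contribution of all other SNPs to SNP j.
            cross = (1.0 - s) * (xj @ fitted / n - diag[j] * beta[j])
            denom = (1.0 - s) * diag[j] + s
            new_beta = soft_threshold(r[j] - cross, lam) / denom
            delta = new_beta - beta[j]
            if delta != 0.0:
                fitted += delta * xj
                beta[j] = new_beta
                max_change = max(max_change, abs(delta))
        if max_change < tol:
            break
    return beta
```

In this sketch the individual-level phenotype never appears: the summary statistics enter only through r, and the reference panel enters only through the LD terms. Setting s = 1 ignores LD entirely and reduces the estimate to soft-thresholding of r. A PGS for a new individual would then be the dot product of that person's standardized genotypes with the returned β.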

