Efficient blockLASSO for polygenic scores with applications to All of Us and UK Biobank

Authors

Raben Timothy G, Lello Louis, Widen Erik, Hsu Stephen D H

Affiliations

Department of Physics and Astronomy, Michigan State University, East Lansing, USA.

Genomic Prediction, Inc., North Brunswick, NJ, USA.

Publication information

BMC Genomics. 2025 Mar 27;26(1):302. doi: 10.1186/s12864-025-11505-0.

DOI: 10.1186/s12864-025-11505-0
PMID: 40148775
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11948729/

Abstract

We develop a "block" LASSO (blockLASSO) approach for training polygenic scores (PGS) and demonstrate its use in All of Us (AoU) and the UK Biobank (UKB). blockLASSO exploits the approximately block-diagonal structure of linkage disequilibrium (LD) that results from the chromosomal partition of the genome. The new implementation is intended for exploratory and methods research in which repeated PGS training is necessary and expensive. For 11 phenotypes, in two biobanks, and across 5 ancestry groups (African, American, East Asian, European, and South Asian), we demonstrate that blockLASSO is generally as effective for training PGS as a (global) LASSO. Previous work has shown that penalized regression methods produce PGS that are competitive with alternative approaches, and that phenotypes differ in how polygenic they are. Using sparse algorithms, an accurate PGS can be trained for type 1 diabetes (T1D) using [count missing in source] single nucleotide variants (SNVs), but a PGS for body mass index (BMI) would need more than 10k SNVs. blockLASSO produces similar PGS for these phenotypes while training with just a fraction of the variants per block. Within AoU (using only genetic information), the block PGS reaches an AUC of [value missing in source] for T1D and a correlation of [value missing in source] for BMI, whereas a global LASSO approach finds an AUC of [value missing in source] for T1D and a correlation of [value missing in source] for BMI. The block approach is more computationally efficient and scalable than naive global machine learning approaches, which makes it well suited to exploratory methods investigations based on penalized regression.
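The block idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how per-block predictors are tuned or combined, so the code below simply assumes that each block (for example, one chromosome) is a contiguous range of genotype columns, fits an independent LASSO per block by cyclic coordinate descent, and concatenates the coefficients. The function names (`soft_threshold`, `lasso_cd`, `block_lasso`), the shared penalty `lam`, and the toy data are all hypothetical.

```python
def soft_threshold(z, lam):
    """Soft-thresholding operator used by the LASSO coordinate update."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_cd(X, y, lam, n_iter=200):
    """Minimize (1/2n)*||y - X b||^2 + lam*||b||_1 by cyclic coordinate descent.

    X is a list of rows (samples), y a list of phenotype values.
    Columns of X and y are assumed to be centered already.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # current residual y - X beta (beta starts at zero)
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # monomorphic variant: no information
            # Correlation of column j with the partial residual.
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return beta


def block_lasso(X, y, blocks, lam):
    """Fit an independent LASSO on each column block and concatenate.

    blocks is a list of (start, stop) column ranges; this mirrors an
    assumed block-diagonal LD structure, where variants in different
    blocks are treated as uncorrelated.
    """
    beta = [0.0] * len(X[0])
    for start, stop in blocks:
        X_block = [row[start:stop] for row in X]
        beta[start:stop] = lasso_cd(X_block, y, lam)
    return beta


if __name__ == "__main__":
    # Toy design with two orthogonal "variants", one per block.
    X = [[1, 1], [1, -1], [-1, 1], [-1, -1]]
    y = [2, 2, -2, -2]  # depends only on the first variant
    print(block_lasso(X, y, blocks=[(0, 1), (1, 2)], lam=0.5))  # [1.5, 0.0]
    print(lasso_cd(X, y, lam=0.5))                              # [1.5, 0.0]
```

When the blocks are exactly uncorrelated, as in the toy design, the per-block fits coincide with the global fit, and each block problem is much smaller than the global one. With real genotypes the block structure is only approximate, so the two solutions would generally differ.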

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ec08/11948729/98fc8264d0f0/12864_2025_11505_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ec08/11948729/9213481ecf30/12864_2025_11505_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ec08/11948729/1c7e32f7caef/12864_2025_11505_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ec08/11948729/7648e9a5e511/12864_2025_11505_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ec08/11948729/8b19c124fd02/12864_2025_11505_Fig4_HTML.jpg
