Improved polygenic prediction by Bayesian multiple regression on summary statistics.

Affiliations

Institute for Molecular Bioscience, University of Queensland, St Lucia, Brisbane, 4072, QLD, Australia.

Estonian Genome Center, Institute of Genomics, University of Tartu, Riia 23b, 51010, Tartu, Estonia.

Publication information

Nat Commun. 2019 Nov 8;10(1):5086. doi: 10.1038/s41467-019-12653-0.

DOI: 10.1038/s41467-019-12653-0

PMID: 31704910

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6841727/

Abstract

Accurate prediction of an individual's phenotype from their DNA sequence is one of the great promises of genomics and precision medicine. We extend a powerful individual-level data Bayesian multiple regression model (BayesR) to one that utilises summary statistics from genome-wide association studies (GWAS), SBayesR. In simulation and cross-validation using 12 real traits and 1.1 million variants on 350,000 individuals from the UK Biobank, SBayesR improves prediction accuracy relative to commonly used state-of-the-art summary statistics methods at a fraction of the computational resources. Furthermore, using summary statistics for variants from the largest GWAS meta-analysis (n ≈ 700,000) on height and BMI, we show that, on average across traits and two independent data sets, SBayesR improves prediction R² by 5.2% relative to LDpred and by 26.5% relative to clumping and p value thresholding.
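To make the reported metric concrete, the following is a minimal sketch of how a polygenic score, its prediction R², and a relative gain in R² between two methods could be computed. It is not the SBayesR algorithm. The toy data, the two weight sets, and the helper names (`polygenic_score`, `prediction_r2`, `relative_improvement`) are all hypothetical, and the sketch assumes the percentages in the abstract are relative gains in R².

```python
import numpy as np


def polygenic_score(genotypes, weights):
    """Weighted sum of allele counts, giving one score per individual."""
    return genotypes @ weights


def prediction_r2(phenotype, score):
    """Squared Pearson correlation between observed phenotype and score."""
    r = np.corrcoef(phenotype, score)[0, 1]
    return r ** 2


def relative_improvement(r2_new, r2_baseline):
    """Relative gain in R^2 over a baseline, expressed as a percentage."""
    return 100.0 * (r2_new - r2_baseline) / r2_baseline


# Toy data (assumption): 500 individuals, 50 variants coded as 0/1/2 allele counts.
rng = np.random.default_rng(0)
n_individuals, n_variants = 500, 50
genotypes = rng.integers(0, 3, size=(n_individuals, n_variants)).astype(float)
true_effects = rng.normal(0.0, 0.1, size=n_variants)
phenotype = genotypes @ true_effects + rng.normal(0.0, 1.0, size=n_individuals)

# Two hypothetical sets of estimated variant weights with different noise levels,
# standing in for the output of two different prediction methods.
weights_a = true_effects + rng.normal(0.0, 0.02, size=n_variants)
weights_b = true_effects + rng.normal(0.0, 0.10, size=n_variants)

r2_a = prediction_r2(phenotype, polygenic_score(genotypes, weights_a))
r2_b = prediction_r2(phenotype, polygenic_score(genotypes, weights_b))
print(f"R2 (weights A): {r2_a:.3f}")
print(f"R2 (weights B): {r2_b:.3f}")
print(f"Relative change of A over B: {relative_improvement(r2_a, r2_b):.1f}%")
```

Under this reading, a method with R² = 0.2104 compared against a baseline with R² = 0.2000 would be described as a 5.2% improvement, which is a much smaller change than 5.2 percentage points of variance explained.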

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5fdb/6841727/38d11633f125/41467_2019_12653_Fig1_HTML.jpg

Similar articles

Incorporating functional priors improves polygenic prediction accuracy in UK Biobank and 23andMe data sets.

Nat Commun. 2021 Oct 18;12(1):6052. doi: 10.1038/s41467-021-25171-9.

Fast and accurate Bayesian polygenic risk modeling with variational inference.

Am J Hum Genet. 2023 May 4;110(5):741-761. doi: 10.1016/j.ajhg.2023.03.009. Epub 2023 Apr 7.

Leveraging functional genomic annotations and genome coverage to improve polygenic prediction of complex traits within and between ancestries.

Nat Genet. 2024 May;56(5):767-777. doi: 10.1038/s41588-024-01704-y. Epub 2024 Apr 30.

Leveraging both individual-level genetic data and GWAS summary statistics increases polygenic prediction.

Am J Hum Genet. 2021 Jun 3;108(6):1001-1011. doi: 10.1016/j.ajhg.2021.04.014. Epub 2021 May 7.

Making the Most of Clumping and Thresholding for Polygenic Scores.

Am J Hum Genet. 2019 Dec 5;105(6):1213-1221. doi: 10.1016/j.ajhg.2019.11.001. Epub 2019 Nov 21.

Fine mapping and accurate prediction of complex traits using Bayesian Variable Selection models applied to biobank-size data.

Eur J Hum Genet. 2023 Mar;31(3):313-320. doi: 10.1038/s41431-022-01135-5. Epub 2022 Jul 19.

Improved genetic prediction of complex traits from individual-level data or summary statistics.

Nat Commun. 2021 Jul 7;12(1):4192. doi: 10.1038/s41467-021-24485-y.

Identity informative SNP associations in the UK Biobank.

Forensic Sci Int Genet. 2019 Sep;42:45-48. doi: 10.1016/j.fsigen.2019.06.007. Epub 2019 Jun 14.

Haplotype function score improves biological interpretation and cross-ancestry polygenic prediction of human complex traits.

Elife. 2024 Apr 19;12:RP92574. doi: 10.7554/eLife.92574.

Cited by

Genetics and Socioeconomic Status: Some Preliminary Evidence on Mechanisms.

J Polit Econ Microecon. 2025 Aug;3(3). doi: 10.1086/732835. Epub 2025 Jul 16.

Mixed modeling approach for characterizing the genetic effects in a longitudinal phenotype.

Ann Appl Stat. 2025 Sep;19(3):2070-2087. doi: 10.1214/25-aoas2033. Epub 2025 Aug 28.

Association between polygenic risk for Major Depression and brain structure in a mega-analysis of 50,975 participants across 11 studies.

Mol Psychiatry. 2025 Aug 19. doi: 10.1038/s41380-025-03136-4.

Genomic risk prediction for depression in a large prospective study of older adults of European descent.

Mol Psychiatry. 2025 Aug 6. doi: 10.1038/s41380-025-03145-3.

Uncovering the multivariate genetic architecture of frailty with genomic structural equation modeling.

Nat Genet. 2025 Aug 4. doi: 10.1038/s41588-025-02269-0.

Enhanced genetic fine mapping accuracy with Bayesian Linear Regression models in diverse genetic architectures.

PLoS Genet. 2025 Jul 30;21(7):e1011783. doi: 10.1371/journal.pgen.1011783. eCollection 2025 Jul.

Disentangling the comorbidity between allergic disease and type 1 diabetes using genetically informative designs.

J Allergy Clin Immunol Glob. 2025 Jun 23;4(4):100519. doi: 10.1016/j.jacig.2025.100519. eCollection 2025 Nov.

The association of a polygenic lifespan score with the risk of common age-related diseases and mortality.

J Gerontol A Biol Sci Med Sci. 2025 Aug 23;80(9). doi: 10.1093/gerona/glaf156.

Genome-wide association meta-regression identifies stem cell lineage orchestration as a key driver of acne risk.

medRxiv. 2025 Jun 28:2025.06.27.25330406. doi: 10.1101/2025.06.27.25330406.

PGSFusion streamlines polygenic score construction and epidemiological applications in biobank-scale cohorts.

Genome Med. 2025 Jul 14;17(1):77. doi: 10.1186/s13073-025-01505-w.

References

SumHer better estimates the SNP heritability of complex traits from summary statistics.

Nat Genet. 2019 Feb;51(2):277-284. doi: 10.1038/s41588-018-0279-5. Epub 2018 Dec 3.

The UK Biobank resource with deep phenotyping and genomic data.

Nature. 2018 Oct;562(7726):203-209. doi: 10.1038/s41586-018-0579-z. Epub 2018 Oct 10.

Accurate Genomic Prediction of Human Height.

Genetics. 2018 Oct;210(2):477-497. doi: 10.1534/genetics.118.301267. Epub 2018 Aug 27.

Meta-analysis of genome-wide association studies for height and body mass index in ∼700000 individuals of European ancestry.

Hum Mol Genet. 2018 Oct 15;27(20):3641-3649. doi: 10.1093/hmg/ddy271.

Estimation of complex effect-size distributions using summary-level statistics from genome-wide association studies across 32 complex traits.

Nat Genet. 2018 Sep;50(9):1318-1326. doi: 10.1038/s41588-018-0193-x. Epub 2018 Aug 13.

Gene discovery and polygenic prediction from a genome-wide association study of educational attainment in 1.1 million individuals.

Nat Genet. 2018 Jul 23;50(8):1112-1121. doi: 10.1038/s41588-018-0147-3.

Estimating SNP-Based Heritability and Genetic Correlation in Case-Control Studies Directly and with Summary Statistics.

Am J Hum Genet. 2018 Jul 5;103(1):89-99. doi: 10.1016/j.ajhg.2018.06.002.

Mixed-model association for biobank-scale datasets.

Nat Genet. 2018 Jul;50(7):906-908. doi: 10.1038/s41588-018-0144-6.

The personal and clinical utility of polygenic risk scores.

Nat Rev Genet. 2018 Sep;19(9):581-590. doi: 10.1038/s41576-018-0018-x.

Signatures of negative selection in the genetic architecture of human complex traits.

Nat Genet. 2018 May;50(5):746-753. doi: 10.1038/s41588-018-0101-4. Epub 2018 Apr 16.
