LSMM: a statistical approach to integrating functional annotations with genome-wide association studies.

Affiliations

Department of Mathematics, Hong Kong Baptist University, Hong Kong.

School of Mathematics and Statistics, Xi'an Jiaotong University, Xi'an, China.

Publication information

Bioinformatics. 2018 Aug 15;34(16):2788-2796. doi: 10.1093/bioinformatics/bty187.

DOI:10.1093/bioinformatics/bty187

PMID:29608640

Abstract

MOTIVATION

Thousands of risk variants underlying complex phenotypes (quantitative traits and diseases) have been identified in genome-wide association studies (GWAS). However, two major challenges remain in deepening our understanding of the genetic architecture of complex phenotypes. First, the majority of GWAS hits are in non-coding regions, and their biological interpretation is still unclear. Second, accumulating evidence from GWAS suggests that complex traits are polygenic, i.e. a complex trait is often affected by many variants with small or moderate effects, and a large proportion of risk variants with small effects remain unknown.

RESULTS

The availability of functional annotation data enables us to address the above challenges. In this study, we propose a latent sparse mixed model (LSMM) to integrate functional annotations with GWAS data. It not only increases the statistical power to identify risk variants but also offers more biological insight by detecting relevant functional annotations. To make LSMM scalable to millions of variants and hundreds of functional annotations, we developed an efficient variational expectation-maximization algorithm for model parameter estimation and statistical inference. We first conducted comprehensive simulation studies to evaluate the performance of LSMM. We then applied it to analyze 30 GWAS of complex phenotypes integrated with nine genic category annotations and 127 cell-type specific functional annotations from the Roadmap project. The results demonstrate that our method has greater statistical power than conventional methods and can help researchers achieve a deeper understanding of the genetic architecture of these complex phenotypes.
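The abstract does not spell out the model, so the following is only a toy illustration of the general idea of letting an annotation inform which variants are likely risk variants. It is not the LSMM algorithm. It assumes a two-groups model in which null p-values are Uniform(0, 1), non-null p-values are Beta(alpha, 1), and the prior probability of being non-null depends on a single binary annotation. It uses plain EM rather than the variational EM mentioned above, and the names `simulate` and `fit_two_groups` are hypothetical.

```python
import math
import random


def simulate(n, alpha, pi, seed=0):
    """Draw toy p-values and one binary annotation per variant."""
    rng = random.Random(seed)
    pvals, annot = [], []
    for _ in range(n):
        a = rng.randrange(2)                  # annotation status, 0 or 1
        is_risk = rng.random() < pi[a]        # latent association status
        u = rng.random()
        # If U ~ Uniform(0,1), then U**(1/alpha) ~ Beta(alpha, 1).
        p = u ** (1.0 / alpha) if is_risk else u
        pvals.append(max(p, 1e-300))          # guard against log(0)
        annot.append(a)
    return pvals, annot


def fit_two_groups(pvals, annot, n_iter=100):
    """Plain EM for the toy model; returns (alpha, pi, posteriors)."""
    logp = [math.log(p) for p in pvals]
    alpha, pi = 0.5, [0.1, 0.1]               # arbitrary starting values
    post = [0.0] * len(pvals)
    for _ in range(n_iter):
        # E-step: posterior probability that each variant is non-null.
        # The null density is 1, so it contributes (1 - pi[a]) * 1.
        for j, (lp, a) in enumerate(zip(logp, annot)):
            f1 = pi[a] * alpha * math.exp((alpha - 1.0) * lp)
            post[j] = f1 / (f1 + (1.0 - pi[a]))
        # M-step: weighted MLE of alpha, then per-annotation proportions.
        num = sum(post)
        den = -sum(r * lp for r, lp in zip(post, logp))
        alpha = min(num / den, 0.999)         # cap alpha below 1
        for a in (0, 1):
            group = [r for r, b in zip(post, annot) if b == a]
            pi[a] = sum(group) / max(len(group), 1)
    return alpha, pi, post
```

For example, fitting data drawn with `simulate(10000, 0.2, (0.05, 0.3), seed=1)` should give a larger estimated non-null proportion for annotated variants than for unannotated ones, which is the sense in which an informative annotation can sharpen the ranking of variants in this toy setting.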

AVAILABILITY AND IMPLEMENTATION

The LSMM software is available at https://github.com/mingjingsi/LSMM.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

Similar articles

LSMM: a statistical approach to integrating functional annotations with genome-wide association studies.

Bioinformatics. 2018 Aug 15;34(16):2788-2796. doi: 10.1093/bioinformatics/bty187.

PALM: a powerful and adaptive latent model for prioritizing risk variants with functional annotations.

Bioinformatics. 2023 Feb 3;39(2). doi: 10.1093/bioinformatics/btad068.

LPM: a latent probit model to characterize the relationship among complex traits using summary statistics from multiple GWASs and functional annotations.

Bioinformatics. 2020 Apr 15;36(8):2506-2514. doi: 10.1093/bioinformatics/btz947.

Pleiotropic mapping and annotation selection in genome-wide association studies with penalized Gaussian mixture models.

Bioinformatics. 2018 Aug 15;34(16):2797-2807. doi: 10.1093/bioinformatics/bty204.

LPG: A four-group probabilistic approach to leveraging pleiotropy in genome-wide association studies.

BMC Genomics. 2018 Jun 28;19(1):503. doi: 10.1186/s12864-018-4851-2.

GPA: a statistical approach to prioritizing GWAS results by integrating pleiotropy and annotation.

PLoS Genet. 2014 Nov 13;10(11):e1004787. doi: 10.1371/journal.pgen.1004787. eCollection 2014 Nov.

LLR: a latent low-rank approach to colocalizing genetic risk variants in multiple GWAS.

Bioinformatics. 2017 Dec 15;33(24):3878-3886. doi: 10.1093/bioinformatics/btx512.

IGESS: a statistical approach to integrating individual-level genotype data and summary statistics in genome-wide association studies.

Bioinformatics. 2017 Sep 15;33(18):2882-2889. doi: 10.1093/bioinformatics/btx314.

GPA-Tree: statistical approach for functional-annotation-tree-guided prioritization of GWAS results.

Bioinformatics. 2022 Jan 27;38(4):1067-1074. doi: 10.1093/bioinformatics/btab802.

Joint analysis of individual-level and summary-level GWAS data by leveraging pleiotropy.

Bioinformatics. 2019 May 15;35(10):1729-1736. doi: 10.1093/bioinformatics/bty870.

Cited by

Spleen volume in relation to ulcerative colitis and Crohn's disease: a Mendelian randomization study.

Sci Rep. 2025 Feb 24;15(1):6588. doi: 10.1038/s41598-025-90104-1.

Funmap: integrating high-dimensional functional annotations to improve fine-mapping.

Bioinformatics. 2024 Dec 26;41(1). doi: 10.1093/bioinformatics/btaf017.

Disease-specific prioritization of non-coding GWAS variants based on chromatin accessibility.

HGG Adv. 2024 Jul 18;5(3):100310. doi: 10.1016/j.xhgg.2024.100310. Epub 2024 May 21.

multi-GPA-Tree: Statistical approach for pleiotropy informed and functional annotation tree guided prioritization of GWAS results.

PLoS Comput Biol. 2023 Dec 7;19(12):e1011686. doi: 10.1371/journal.pcbi.1011686. eCollection 2023 Dec.

Evaluating 17 methods incorporating biological function with GWAS summary statistics to accelerate discovery demonstrates a tradeoff between high sensitivity and high positive predictive value.

Commun Biol. 2023 Nov 24;6(1):1199. doi: 10.1038/s42003-023-05413-w.

XMAP: Cross-population fine-mapping by leveraging genetic diversity and accounting for confounding bias.

Nat Commun. 2023 Oct 28;14(1):6870. doi: 10.1038/s41467-023-42614-7.

graph-GPA 2.0: improving multi-disease genetic analysis with integration of functional annotation data.

Front Genet. 2023 Jul 12;14:1079198. doi: 10.3389/fgene.2023.1079198. eCollection 2023.

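PALM: a powerful and adaptive latent model for prioritizing risk variants with functional annotations.

Bioinformatics. 2023 Feb 3;39(2). doi: 10.1093/bioinformatics/btad068.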

Dissecting Complex Traits Using Omics Data: A Review on the Linear Mixed Models and Their Application in GWAS.

Plants (Basel). 2022 Nov 28;11(23):3277. doi: 10.3390/plants11233277.

Leveraging the local genetic structure for trans-ancestry association mapping.

Am J Hum Genet. 2022 Jul 7;109(7):1317-1337. doi: 10.1016/j.ajhg.2022.05.013. Epub 2022 Jun 16.
