Performance of epistasis detection methods in semi-simulated GWAS.

Affiliations

SANOFI R&D, Translational Sciences, Chilly Mazarin, 91385, France.

Laboratoire de Probabilités et Modèles Aléatoires, Université Pierre et Marie Curie, 4, place Jussieu, Paris Cedex 05, 75252, France.

Publication information

BMC Bioinformatics. 2018 Jun 18;19(1):231. doi: 10.1186/s12859-018-2229-8.

DOI:10.1186/s12859-018-2229-8
PMID:29914375
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6006572/
Abstract

BACKGROUND

Part of the missing heritability in Genome Wide Association Studies (GWAS) is expected to be explained by interactions between genetic variants, also called epistasis. Various statistical methods have been developed to detect epistasis in case-control GWAS. These methods face major statistical challenges due to the number of tests required, the complexity of the Linkage Disequilibrium (LD) structure, and the lack of consensus regarding the definition of epistasis. Their limited impact in terms of uncovering new biological knowledge might be explained in part by the limited amount of experimental data available to validate their statistical performance in a realistic GWAS context. In this paper, we introduce a simulation pipeline for generating real-scale GWAS data, including epistasis and realistic LD structure. We evaluate five exhaustive bivariate interaction methods: fastepi, GBOOST, SHEsisEpi, DSS, and IndOR. Two hundred thirty-four different disease scenarios are considered in extensive simulations. We report the performance of each method in terms of false positive rate control, power, area under the ROC curve (AUC), and computation time using a GPU. Finally, we compare the results of each method on a real GWAS of type 2 diabetes from the Wellcome Trust Case Control Consortium (WTCCC).

RESULTS

GBOOST, SHEsisEpi and DSS allow a satisfactory control of the false positive rate. fastepi and IndOR present an increase in false positive rate in the presence of LD between causal SNPs, with our definition of epistasis. DSS performs best in terms of power and AUC in most scenarios with no or weak LD between causal SNPs. All methods can exhaustively analyze a GWAS with 6 × 10^5 SNPs and 15,000 samples in a couple of hours using a GPU.
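To give a sense of the scale of an exhaustive bivariate search, the sketch below counts the SNP pairs in a GWAS of that size and derives a per-test significance threshold. It assumes a SNP count of 600,000 and a Bonferroni correction at a family-wise level of 0.05. The correction, the level, and the function name `pairwise_test_burden` are illustrative assumptions, not the procedure used in the paper.

```python
from math import comb


def pairwise_test_burden(n_snps: int, alpha: float = 0.05) -> tuple[int, float]:
    """Count unordered SNP pairs and the Bonferroni per-test threshold.

    Assumption: one test per unordered SNP pair, corrected with a plain
    Bonferroni bound. The paper's own thresholding may differ.
    """
    n_pairs = comb(n_snps, 2)  # n * (n - 1) / 2 unordered pairs
    return n_pairs, alpha / n_pairs


# Assumed GWAS size: 600,000 SNPs (hypothetical round figure).
n_pairs, threshold = pairwise_test_burden(600_000)
print(f"{n_pairs:,} pairs, per-test threshold {threshold:.2e}")
```

With roughly 1.8 × 10^11 pairs, any per-pair statistic must be cheap to compute, which is why GPU implementations matter for this task.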

CONCLUSION

This study confirms that computation time is no longer a limiting factor for performing an exhaustive search of epistasis in large GWAS. For this task, using DSS on SNP pairs with limited LD seems to be a good strategy to achieve the best statistical performance. A combination approach using both DSS and GBOOST is supported by the simulation results, and the analysis of the WTCCC dataset demonstrated that this approach can detect distinct genes in epistasis. Finally, weak epistasis between common variants will be detectable with existing methods when GWAS of a few tens of thousands of cases and controls are available.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1d5b/6006572/57ba85da99e5/12859_2018_2229_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1d5b/6006572/512fb7a244df/12859_2018_2229_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1d5b/6006572/8833af9ec800/12859_2018_2229_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1d5b/6006572/a0daa0ed496f/12859_2018_2229_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1d5b/6006572/b5f6dbdaae0a/12859_2018_2229_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1d5b/6006572/21e599b2f140/12859_2018_2229_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1d5b/6006572/3932a3350008/12859_2018_2229_Fig7_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1d5b/6006572/466e8ea8e8cd/12859_2018_2229_Fig8_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1d5b/6006572/6413a3632045/12859_2018_2229_Fig9_HTML.jpg