The choice of null distributions for detecting gene-gene interactions in genome-wide association studies.

Affiliation

Department of Electronic and Computer Engineering, Hong Kong University of Science and Technology, Hong Kong.

Publication information

BMC Bioinformatics. 2011 Feb 15;12 Suppl 1(Suppl 1):S26. doi: 10.1186/1471-2105-12-S1-S26.

DOI:10.1186/1471-2105-12-S1-S26

PMID:21342556

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3044281/

Abstract

BACKGROUND

In genome-wide association studies (GWAS), the number of single-nucleotide polymorphisms (SNPs) typically ranges between 500,000 and 1,000,000. Accordingly, detecting gene-gene interactions in GWAS is computationally challenging because it involves hundreds of billions of SNP pairs. Stage-wise strategies are often used to overcome this computational difficulty. In the first stage, fast screening methods (e.g., Tuning ReliefF) are applied to reduce the whole SNP set to a small subset. In the second stage, sophisticated modeling methods (e.g., multifactor dimensionality reduction, MDR) are applied to the subset of SNPs to identify interesting interaction models and the corresponding interaction patterns. In the third stage, the significance of the identified interaction patterns is evaluated by hypothesis testing.
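The "hundreds of billions" figure follows from counting unordered pairs, n(n - 1)/2. The short sketch below checks that arithmetic for the two SNP counts quoted above; the function name `snp_pair_count` is only an illustrative choice and does not come from the paper.

```python
from math import comb


def snp_pair_count(n_snps: int) -> int:
    """Number of unordered SNP pairs among n_snps markers, i.e. n * (n - 1) / 2."""
    return comb(n_snps, 2)


if __name__ == "__main__":
    for n in (500_000, 1_000_000):
        print(f"{n:,} SNPs -> {snp_pair_count(n):,} pairs")
```

For 500,000 SNPs this gives 124,999,750,000 pairs, and for 1,000,000 SNPs it gives 499,999,500,000 pairs, which is why an exhaustive pairwise search is usually preceded by a screening stage.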

RESULTS

In this paper, we show that this stage-wise strategy can fail to control the false positive rate if the null distribution is not chosen appropriately. This is because screening and modeling may change the null distribution used in hypothesis testing. In our simulation study, we use several popular screening methods and the popular modeling method MDR as examples to show the effect of an inappropriate choice of null distribution. To choose appropriate null distributions, we suggest using the permutation test or testing on an independent data set. We demonstrate their performance using synthetic data and a real genome-wide data set from an age-related macular degeneration (AMD) study.
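The effect described here can be illustrated with a toy simulation. The sketch below is not the authors' pipeline: it assumes binary markers, uses an exhaustive search for the marker pair whose XOR pattern is most associated with the phenotype as a stand-in for screening plus MDR, and generates phenotypes that are independent of all markers. It then compares three p-values for the selected pair: a naive chi-square p-value (1 degree of freedom) that ignores the selection, a permutation p-value that repeats the selection on every shuffled phenotype, and a p-value computed on a held-out half of the samples. All function names and parameter values are hypothetical.

```python
import math
import random
from itertools import combinations


def chi2_2x2(x, y):
    """Pearson chi-square statistic for two equal-length 0/1 sequences."""
    n = len(x)
    a = sum(xi & yi for xi, yi in zip(x, y))  # count of x=1, y=1
    row1, col1 = sum(x), sum(y)
    b, c = row1 - a, col1 - a                 # (x=1, y=0) and (x=0, y=1)
    d = n - row1 - col1 + a                   # count of x=0, y=0
    denom = row1 * (n - row1) * col1 * (n - col1)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom


def chi2_sf_df1(stat):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))


def xor_features(markers):
    """One 0/1 feature per marker pair: the XOR of the two markers."""
    return [[u ^ v for u, v in zip(a, b)] for a, b in combinations(markers, 2)]


def three_p_values(markers, y, n_perm, rng):
    """Naive, permutation, and hold-out p-values for the best-scoring pair."""
    feats = xor_features(markers)
    observed = max(chi2_2x2(f, y) for f in feats)
    naive_p = chi2_sf_df1(observed)  # ignores that the pair was selected

    # Permutation null: shuffle the phenotype and redo the whole selection.
    y_perm, exceed = list(y), 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)
        exceed += max(chi2_2x2(f, y_perm) for f in feats) >= observed
    perm_p = (exceed + 1) / (n_perm + 1)

    # Independent data: select on the first half, test on the second half.
    half = len(y) // 2
    best = max(feats, key=lambda f: chi2_2x2(f[:half], y[:half]))
    holdout_p = chi2_sf_df1(chi2_2x2(best[half:], y[half:]))
    return naive_p, perm_p, holdout_p


def simulate(n_datasets=50, n_samples=60, n_markers=6, n_perm=99,
             alpha=0.05, seed=0):
    """Fraction of pure-noise data sets declared significant by each approach."""
    rng = random.Random(seed)
    hits = [0, 0, 0]
    for _ in range(n_datasets):
        markers = [[rng.randint(0, 1) for _ in range(n_samples)]
                   for _ in range(n_markers)]
        y = [rng.randint(0, 1) for _ in range(n_samples)]  # no true signal
        for k, p in enumerate(three_p_values(markers, y, n_perm, rng)):
            hits[k] += p < alpha
    return tuple(h / n_datasets for h in hits)


if __name__ == "__main__":
    naive, perm, holdout = simulate()
    print(f"naive: {naive:.2f}  permutation: {perm:.2f}  hold-out: {holdout:.2f}")
```

Because every data set is pure noise, each reported rate is an empirical false positive rate. Under these assumptions the naive rate should sit well above the nominal 0.05 level, since the best of 15 candidate pairs is tested as if it had been fixed in advance, while the permutation and hold-out rates should stay close to it. The hold-out variant pays for this with a smaller sample in the test stage.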

CONCLUSIONS

The permutation test or testing on an independent data set can help in choosing appropriate null distributions for hypothesis testing, which yields more reliable results in practice.
