FastANOVA: an Efficient Algorithm for Genome-Wide Association Study.

Authors

Zhang Xiang, Zou Fei, Wang Wei

Affiliation

Department of Computer Science, University of North Carolina at Chapel Hill.

Publication

KDD. 2008:821-829.

PMID: 20945829
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2951741/
Abstract

Studying the association between quantitative phenotype (such as height or weight) and single nucleotide polymorphisms (SNPs) is an important problem in biology. To understand underlying mechanisms of complex phenotypes, it is often necessary to consider joint genetic effects across multiple SNPs. ANOVA (analysis of variance) test is routinely used in association study. Important findings from studying gene-gene (SNP-pair) interactions are appearing in the literature. However, the number of SNPs can be up to millions. Evaluating joint effects of SNPs is a challenging task even for SNP-pairs. Moreover, with large number of SNPs correlated, permutation procedure is preferred over simple Bonferroni correction for properly controlling family-wise error rate and retaining mapping power, which dramatically increases the computational cost of association study.

In this paper, we study the problem of finding SNP-pairs that have significant associations with a given quantitative phenotype. We propose an efficient algorithm, FastANOVA, for performing ANOVA tests on SNP-pairs in a batch mode, which also supports large permutation test. We derive an upper bound of SNP-pair ANOVA test, which can be expressed as the sum of two terms. The first term is based on single-SNP ANOVA test. The second term is based on the SNPs and independent of any phenotype permutation. Furthermore, SNP-pairs can be organized into groups, each of which shares a common upper bound. This allows for maximum reuse of intermediate computation, efficient upper bound estimation, and effective SNP-pair pruning. Consequently, FastANOVA only needs to perform the ANOVA test on a small number of candidate SNP-pairs without the risk of missing any significant ones. Extensive experiments demonstrate that FastANOVA is orders of magnitude faster than the brute-force implementation of ANOVA tests on all SNP pairs.
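
To make the computational cost concrete, the following is a minimal Python sketch of the brute-force baseline the abstract compares against: an ANOVA F-test on every SNP pair, plus a permutation procedure that uses the maximum statistic to control the family-wise error rate. It is not FastANOVA itself; the abstract does not state the upper-bound formula, so no pruning is attempted here. The function names (`f_statistic`, `pair_f_statistics`, `permutation_threshold`), the 0/1/2 genotype coding, and the default parameters are all assumptions made for illustration.

```python
import itertools

import numpy as np


def f_statistic(phenotype, groups):
    """One-way ANOVA F-statistic of `phenotype` across the labels in `groups`."""
    phenotype = np.asarray(phenotype, dtype=float)
    labels, inverse = np.unique(groups, return_inverse=True)
    k, n = len(labels), len(phenotype)
    if k < 2 or n <= k:
        return 0.0  # test undefined: a single group or no residual degrees of freedom
    counts = np.bincount(inverse, minlength=k)
    sums = np.bincount(inverse, weights=phenotype, minlength=k)
    grand_mean = phenotype.mean()
    ss_between = np.sum(counts * (sums / counts - grand_mean) ** 2)
    ss_total = np.sum((phenotype - grand_mean) ** 2)
    ss_within = ss_total - ss_between
    if ss_within <= 0:
        return float("inf")  # groups explain the phenotype exactly
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def pair_f_statistics(genotypes, phenotype):
    """Brute force: F-statistic for every SNP pair.

    Assumption: `genotypes` is an integer array of shape (n_samples, n_snps)
    with codes 0, 1, 2, so `3 * a + b` labels each joint genotype uniquely.
    """
    n_snps = genotypes.shape[1]
    return {
        (i, j): f_statistic(phenotype, genotypes[:, i] * 3 + genotypes[:, j])
        for i, j in itertools.combinations(range(n_snps), 2)
    }


def permutation_threshold(genotypes, phenotype, n_perm=100, alpha=0.05, seed=0):
    """Critical value from the max pairwise F-statistic under phenotype shuffles."""
    rng = np.random.default_rng(seed)
    max_stats = [
        max(pair_f_statistics(genotypes, rng.permutation(phenotype)).values())
        for _ in range(n_perm)
    ]
    return float(np.quantile(max_stats, 1 - alpha))
```

Each permutation repeats the full scan over all pairs, so the cost grows with the number of permutations times the square of the number of SNPs. That product is the cost the abstract says FastANOVA reduces by pruning SNP-pairs whose upper bound falls below the significance threshold.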
