

Powerful multi-marker association tests: unifying genomic distance-based regression and logistic regression.

Affiliation

Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota 55455–0392, USA.

Publication information

Genet Epidemiol. 2010 Nov;34(7):680-8. doi: 10.1002/gepi.20529.

Abstract

To detect genetic association with common and complex diseases, many statistical tests have been proposed for candidate gene or genome-wide association studies with the case-control design. Due to linkage disequilibrium (LD), multi-marker association tests can gain power over single-marker tests with a Bonferroni multiple testing adjustment. Among the many existing multi-marker association tests, most aim to detect only one of the many possible aspects of distributional differences between the genotypes of cases and controls, such as allele frequency differences, while a few newer ones target two or three aspects, all of which can be implemented in logistic regression. In contrast to logistic regression, a genomic distance-based regression (GDBR) approach aims to detect some high-order genotypic differences between cases and controls. A recent study has confirmed the high power of GDBR tests. At present, the popular logistic regression and the emerging GDBR approaches are completely unrelated; for example, one has to choose between the two. In this article, we reformulate GDBR as logistic regression, opening an avenue to constructing other powerful tests while overcoming some limitations of GDBR. For example, asymptotic distributions can replace time-consuming permutations for deriving P-values, and covariates, including gene-gene interactions, can be easily incorporated. Importantly, this reformulation facilitates combining GDBR with other existing methods in a unified framework of logistic regression. In particular, we show that Fisher's P-value combining method can boost statistical power by incorporating information from allele frequencies, Hardy-Weinberg disequilibrium, LD patterns, and other higher-order interactions among multiple markers as captured by GDBR.
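As an illustration of the Fisher P-value combination mentioned in the abstract, the following minimal sketch implements the textbook form of the method: for k independent P-values, the statistic X = -2 Σ ln(p_i) follows a chi-square distribution with 2k degrees of freedom under the null hypothesis. The abstract does not say how dependence among the component tests is handled, so the independence assumption here is ours, and the function name `fisher_combine` is hypothetical rather than taken from the paper.

```python
import math


def fisher_combine(p_values):
    """Combine P-values with Fisher's method (textbook form).

    Assumption (ours, not the paper's): the P-values are independent,
    so X = -2 * sum(ln p_i) is chi-square with 2k degrees of freedom
    under the null hypothesis.
    Returns (statistic, combined_p_value).
    """
    if not p_values:
        raise ValueError("need at least one P-value")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("each P-value must lie in (0, 1]")

    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # equals X / 2

    # For even degrees of freedom 2k, the chi-square survival function
    # has the closed form exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!.
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term

    return 2.0 * half_stat, math.exp(-half_stat) * total


if __name__ == "__main__":
    # Hypothetical P-values from three component tests, for illustration only.
    stat, p_combined = fisher_combine([0.04, 0.20, 0.03])
    print(f"X = {stat:.3f}, combined P = {p_combined:.4f}")
```

The closed-form survival function avoids a dependency on a statistics library. When the component tests are correlated, as tests computed on the same genotype data usually are, the chi-square reference distribution is no longer exact and the combined P-value would need to be calibrated by other means.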




