Properties of permutation-based gene tests and controlling type 1 error using a summary statistic based gene test.

Affiliation

Department of Biostatistics, Harvard School of Public Health, Boston, Massachusetts.

Publication information

BMC Genet. 2013 Nov 7;14:108. doi: 10.1186/1471-2156-14-108.

DOI:10.1186/1471-2156-14-108

PMID:24199751

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3831057/

Abstract

BACKGROUND

The advent of genome-wide association studies has led to many novel disease-SNP associations, opening the door to focused study of their biological underpinnings. Because of the importance of analyzing these associations, numerous statistical methods have been devoted to them. However, fewer methods have attempted to associate entire genes or genomic regions with outcomes, even though such associations are potentially more useful from a biological perspective, and the methods currently implemented are often permutation-based.

RESULTS

One property of some permutation-based tests is that their power depends on whether the significant markers lie in regions of linkage disequilibrium (LD), which we show from a theoretical perspective. We therefore develop two methods for quantifying the degree of association between a genomic region and an outcome, neither of whose power varies as a function of LD structure. One method uses dimension reduction to "filter" redundant information when significant LD exists in the region, while the other, called the summary-statistic test, controls for LD by scaling marker Z-statistics using knowledge of the correlation matrix of markers. An advantage of the latter test is that it does not require the original data, only the Z-statistics from univariate regressions and an estimate of the correlation structure of markers. We also show how to modify the test to protect the type 1 error rate when the correlation structure of markers is misspecified. We apply these methods to sequence data on oral cleft and compare our results to previously proposed gene tests, in particular permutation-based ones. Because the specified correlation structure between markers can be inaccurate, we also evaluate the versatility of the modified summary-statistic test.
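The idea of scaling marker Z-statistics by their correlation matrix can be illustrated with a small sketch. It computes the generic quadratic form Q = Z' R^{-1} Z, which is chi-square distributed with m degrees of freedom under the null when R is the true correlation matrix of the m Z-statistics. This is an illustrative decorrelation statistic and not necessarily the exact statistic used in the paper. It also does not implement the paper's modification for a misspecified correlation structure. The function names and example values are hypothetical.

```python
import math


def cholesky(a):
    """Return lower-triangular L with L L^T = a (a symmetric positive definite)."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                low[i][j] = math.sqrt(s)
            else:
                low[i][j] = s / low[j][j]
    return low


def forward_solve(low, b):
    """Solve low @ y = b by forward substitution."""
    y = []
    for i, row in enumerate(low):
        y.append((b[i] - sum(row[k] * y[k] for k in range(i))) / row[i])
    return y


def chi2_sf(x, df):
    """Upper tail P(X > x) of a chi-square with df degrees of freedom.

    Uses the power series for the regularized lower incomplete gamma
    function; adequate for a sketch, not for extreme tail probabilities.
    """
    if x <= 0.0:
        return 1.0
    a, half_x = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 1
    while abs(term) > 1e-15 * abs(total) and n < 10000:
        term *= half_x / (a + n)
        total += term
        n += 1
    log_p = a * math.log(half_x) - half_x - math.lgamma(a) + math.log(total)
    return max(0.0, 1.0 - math.exp(log_p))


def summary_stat_test(z, r):
    """Return (Q, p) where Q = z^T R^{-1} z, computed via Cholesky of R."""
    y = forward_solve(cholesky(r), z)  # y = L^{-1} z, so Q = ||y||^2
    q = sum(v * v for v in y)
    return q, chi2_sf(q, len(z))


if __name__ == "__main__":
    z = [2.5, 2.5]  # hypothetical Z-statistics for two markers
    for rho in (0.0, 0.9):  # no LD versus strong LD
        q, p = summary_stat_test(z, [[1.0, rho], [rho, 1.0]])
        print(f"rho={rho:.1f}  Q={q:.4f}  p={p:.5f}")
```

In this toy case the same pair of Z-statistics yields Q = 12.5 when the markers are uncorrelated but only about 6.58 when their correlation is 0.9, because two concordant signals in strong LD carry largely redundant evidence.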

CONCLUSION

We find a significant association in the sequence data between the 8q24 region and oral cleft using our dimension reduction approach, and a borderline significant association using the summary-statistic based approach. We also implement the summary-statistic test using Z-statistics from an already-published GWAS of chronic obstructive pulmonary disease (COPD) and a correlation structure obtained from HapMap. We experiment with the modification of this test because the correlation structure is assumed to be imperfectly known.
