Identifying genetic marker sets associated with phenotypes via an efficient adaptive score test.

Affiliation

Department of Biostatistics, Harvard School of Public Health, Boston, MA 02115, USA.

Publication information

Biostatistics. 2012 Sep;13(4):776-90. doi: 10.1093/biostatistics/kxs015. Epub 2012 Jun 25.

Abstract

In recent years, genome-wide association studies (GWAS) and gene-expression profiling have generated a large number of valuable datasets for assessing how genetic variations are related to disease outcomes. With such datasets, it is often of interest to assess the overall effect of a set of genetic markers, assembled based on biological knowledge. Genetic marker-set analyses have been advocated as more reliable and powerful approaches compared with the traditional marginal approaches (Curtis and others, 2005. Pathways to the analysis of microarray data. Trends in Biotechnology 23, 429-435; Efroni and others, 2007. Identification of key processes underlying cancer phenotypes using biologic pathway analysis. PLoS One 2, 425). Procedures for testing the overall effect of a marker-set have been actively studied in recent years. For example, score tests derived under an empirical Bayes (EB) framework (Liu and others, 2007. Semiparametric regression of multidimensional genetic pathway data: least-squares kernel machines and linear mixed models. Biometrics 63, 1079-1088; Liu and others, 2008. Estimation and testing for the effect of a genetic pathway on a disease outcome using logistic kernel machine regression via logistic mixed models. BMC Bioinformatics 9, 292-2; Wu and others, 2010. Powerful SNP-set analysis for case-control genome-wide association studies. American Journal of Human Genetics 86, 929) have been proposed as powerful alternatives to the standard Rao score test (Rao, 1948. Large sample tests of statistical hypotheses concerning several parameters with applications to problems of estimation. Mathematical Proceedings of the Cambridge Philosophical Society, 44, 50-57). The advantages of these EB-based tests are most apparent when the markers are correlated, due to the reduction in the degrees of freedom.
In this paper, we propose an adaptive score test which up- or down-weights the contributions from each member of the marker-set based on the Z-scores of their effects. Such an adaptive procedure gains power over the existing procedures when the signal is sparse and the correlation among the markers is weak. By combining evidence from both the EB-based score test and the adaptive test, we further construct an omnibus test that attains good power in most settings. The null distributions of the proposed test statistics can be approximated well either via simple perturbation procedures or via distributional approximations. Through extensive simulation studies, we demonstrate that the proposed procedures perform well in finite samples. We apply the tests to a breast cancer genetic study to assess the overall effect of the FGFR2 gene on breast cancer risk.
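
The idea of weighting each marker by its own Z-score and calibrating the statistic by perturbation can be illustrated with a minimal Python sketch. This is not the authors' procedure: the weight form |Z_j|**gamma, the exponent `gamma`, the function name `marker_set_tests`, and the intercept-only null model for a continuous phenotype are all assumptions made for illustration. The unweighted sum of squared Z-scores stands in for an EB-style score statistic.

```python
import numpy as np

def marker_set_tests(y, G, gamma=1.0, n_perturb=2000, seed=0):
    """Perturbation p-values for two marker-set score statistics.

    y : (n,) continuous phenotype; G : (n, p) marker matrix (no constant columns).
    gamma : hypothetical exponent for the weights |Z_j|**gamma (an assumption,
            not taken from the paper).
    """
    rng = np.random.default_rng(seed)
    n, _ = G.shape
    resid = y - y.mean()                  # residuals under the intercept-only null
    Gc = G - G.mean(axis=0)               # centered markers
    sigma = resid.std(ddof=1)
    # Approximate null standard deviation of each marginal score sum_i Gc_ij * r_i
    scale = sigma * np.sqrt((Gc ** 2).sum(axis=0))

    def stats(r):
        z = (Gc.T @ r) / scale            # marginal Z-scores, one per marker
        q_sum = np.sum(z ** 2)            # unweighted sum of squared Z-scores
        q_adapt = np.sum(np.abs(z) ** gamma * z ** 2)  # up-weights large |Z_j|
        return q_sum, q_adapt

    obs = np.array(stats(resid))
    # Perturbation: multiply residuals by independent N(0, 1) draws; G is kept
    # fixed, so the correlation among markers carries over to the null draws.
    null = np.array([stats(resid * rng.standard_normal(n))
                     for _ in range(n_perturb)])
    pvals = (1 + (null >= obs).sum(axis=0)) / (1 + n_perturb)
    return {"sum": float(pvals[0]), "adaptive": float(pvals[1])}
```

A call such as `marker_set_tests(y, G)` returns one p-value per statistic. Because the same perturbation draws are used for both statistics, they could also be reused to calibrate a combined (omnibus) statistic, which this sketch leaves out.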

Similar articles

A gene based combination test using GWAS summary data.
BMC Bioinformatics. 2023 Jan 3;24(1):2. doi: 10.1186/s12859-022-05114-x.

Cited by

An Adaptive Genetic Association Test Using Double Kernel Machines.
Stat Biosci. 2015 Oct 1;7(2):262-281. doi: 10.1007/s12561-014-9116-2. Epub 2014 Jun 24.
