Petersen Ashley, Spratt Justin, Tintle Nathan L
Department of Biostatistics, University of Washington, Seattle, WA, USA.
Methods Mol Biol. 2013;1019:519-41. doi: 10.1007/978-1-62703-447-0_25.
Typical methods for analyzing genome-wide single nucleotide variant (SNV) data in cases and controls test each variant's genotypes separately for association with the phenotype and then apply a substantial multiple-testing penalty to limit the rate of false positives. This approach, however, can result in low power to detect modestly associated SNVs. Furthermore, simply looking at the most strongly associated SNVs may not directly yield biological insights about disease etiology. SNVset methods attempt to address both limitations of the traditional approach by testing biologically meaningful sets of SNVs (e.g., genes or pathways). The number of tests run in a SNVset analysis is typically much lower (hundreds or thousands instead of millions) than in a traditional analysis, so the multiple-testing burden, and with it the risk of false positives, is lower. Additionally, because the SNVsets tested are biologically meaningful, finding a significant set may more quickly yield insights into disease etiology. In this chapter we summarize the short history of SNVset testing and provide an overview of the many recently proposed methods. We also provide detailed step-by-step instructions on how to perform a SNVset analysis, including a substantial number of practical tips and questions that researchers should consider before undertaking one. Lastly, we describe a companion R package (snvset) that implements recently proposed SNVset methods. While SNVset testing is a new approach, with many methods still being developed and many open questions, its promise merits serious consideration when choosing analytic methods for GWAS.
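To make the contrast in multiple-testing penalties concrete, the following is a minimal sketch that assumes a Bonferroni correction, which the abstract does not name. The test counts (1,000,000 single-variant tests versus 1,000 set-level tests) and the family-wise level of 0.05 are illustrative assumptions rather than figures from the chapter, and `bonferroni_threshold` is a hypothetical helper, not part of the snvset package.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test p-value threshold under a Bonferroni correction.

    Dividing alpha by the number of tests keeps the family-wise
    error rate at or below alpha.
    """
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


ALPHA = 0.05              # assumed family-wise error rate
N_SNV_TESTS = 1_000_000   # assumed: one test per variant
N_SET_TESTS = 1_000       # assumed: one test per SNVset (e.g., per gene)

snv_threshold = bonferroni_threshold(ALPHA, N_SNV_TESTS)
set_threshold = bonferroni_threshold(ALPHA, N_SET_TESTS)

print(f"Per-SNV threshold:    {snv_threshold:.1e}")
print(f"Per-SNVset threshold: {set_threshold:.1e}")
print(f"Relaxation factor:    {set_threshold / snv_threshold:.0f}x")
```

Under these assumptions, a set-level test has to clear a per-test threshold that is a thousand times less stringent than a single-variant test. This is one way to see why set-based testing can help when individual SNVs are only modestly associated.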