Center for Statistical Genetics, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA.
Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA.
Am J Hum Genet. 2014 May 1;94(5):770-83. doi: 10.1016/j.ajhg.2014.04.004.
Currently, there is great interest in detecting associations between complex traits and rare variants. In this report, we describe Variant Association Tools (VAT) and the VAT pipeline, which implements best practices for rare-variant association studies. Highlights of VAT include variant-site and call-level quality control (QC), summary statistics, phenotype- and genotype-based sample selection, variant annotation, selection of variants for association analysis, and a collection of rare-variant association methods for analyzing qualitative and quantitative traits. The association testing framework for VAT is regression based, which readily allows flexible construction of association models with multiple covariates and with weighting schemes based on allele frequencies or predicted functionality. Additionally, pathway analyses, conditional analyses, and analyses of gene-gene and gene-environment interactions can be performed. VAT can rapidly scan through data by using multi-process computation, adaptive permutation, and simultaneous association analysis via multiple methods. Results are available in text or graphic file formats and can additionally be output to relational databases for further annotation and filtering. An interface to the R language also facilitates user implementation of novel association methods. VAT's data QC and association-analysis pipeline can be applied to sequence, imputed, and genotyping-array (e.g., "exome chip") data, providing a reliable and reproducible computational environment in which to analyze small- to large-scale studies with data from the latest genotyping and sequencing technologies. Application of the VAT pipeline is demonstrated through analysis of data from the 1000 Genomes Project.
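To make the idea of adaptive permutation concrete, the following is a minimal sketch in plain Python. It is not VAT's implementation or interface: the function names, the burden statistic (absolute difference in mean burden between cases and controls), the optional per-variant weights, and the early-stopping rule (a normal-approximation confidence bound checked every `check_every` permutations) are all assumptions chosen for illustration.

```python
import random


def burden_scores(genotypes, weights=None):
    """Collapse each sample's rare-variant genotypes into one burden score.

    genotypes: list of per-sample lists of minor-allele counts (0, 1 or 2).
    weights:   optional per-variant weights (hypothetical, e.g. derived from
               allele frequency); defaults to equal weights.
    """
    if weights is None:
        weights = [1.0] * len(genotypes[0])
    return [sum(w * g for w, g in zip(weights, row)) for row in genotypes]


def mean_difference(scores, phenotype):
    """Absolute difference in mean burden between cases (1) and controls (0)."""
    cases = [s for s, y in zip(scores, phenotype) if y == 1]
    controls = [s for s, y in zip(scores, phenotype) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    return abs(sum(cases) / len(cases) - sum(controls) / len(controls))


def adaptive_permutation_pvalue(scores, phenotype, max_perms=10000,
                                check_every=100, alpha=0.05, seed=0):
    """Permutation p-value that stops early for clearly non-significant tests.

    Returns (p_value, permutations_used). The stopping rule is an assumed,
    simplified one: stop once the lower 95% bound of the running p-value
    estimate is above alpha.
    """
    rng = random.Random(seed)
    observed = mean_difference(scores, phenotype)
    shuffled = list(phenotype)
    exceed = 0
    for n in range(1, max_perms + 1):
        rng.shuffle(shuffled)  # break any genotype-phenotype link
        if mean_difference(scores, shuffled) >= observed:
            exceed += 1
        if n % check_every == 0:
            p = (exceed + 1) / (n + 1)
            se = (p * (1 - p) / n) ** 0.5
            if p - 1.96 * se > alpha:
                return p, n  # no need to spend more permutations
    return (exceed + 1) / (max_perms + 1), max_perms


# Toy data: 20 cases followed by 20 controls.
phenotype = [1] * 20 + [0] * 20
signal = burden_scores([[1, 1]] * 20 + [[0, 0]] * 20)  # burden only in cases
null = burden_scores([[1, 0]] * 40)                    # identical burden

p_signal, n_signal = adaptive_permutation_pvalue(signal, phenotype, max_perms=2000)
p_null, n_null = adaptive_permutation_pvalue(null, phenotype, max_perms=2000)
```

In this toy setting the null test stops at the first check, while the test with a strong signal uses the full permutation budget. That is the point of the adaptive scheme: permutations are spent mainly on the units that might be significant.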