Reed Eric, Nunez Sara, Kulp David, Qian Jing, Reilly Muredach P, Foulkes Andrea S
Department of Mathematics and Statistics, Mount Holyoke College, South Hadley, MA, U.S.A.
Department of Computer Science, University of Massachusetts, Amherst, MA, U.S.A.
Stat Med. 2015 Dec 10;34(28):3769-92. doi: 10.1002/sim.6605. Epub 2015 Sep 6.
This tutorial is a learning resource that outlines the basic process and provides specific software tools for implementing a complete genome-wide association analysis. Approaches to post-analytic visualization and interrogation of potentially novel findings are also presented. Applications are illustrated using the free and open-source R statistical computing and graphics software environment, Bioconductor software for bioinformatics, and the UCSC Genome Browser. Complete genome-wide association data on 1,401 individuals across 861,473 typed single nucleotide polymorphisms from the PennCATH study of coronary artery disease are used for illustration. All data and code, as well as additional instructional resources, are publicly available through the Open Resources in Statistical Genomics project: http://www.stat-gen.org.