Clayton David, Leung Hin-Tak
Juvenile Diabetes Research Foundation/Wellcome Trust Diabetes and Inflammation Laboratory, Cambridge Institute for Medical Research, University of Cambridge, Cambridge, UK.
Hum Hered. 2007;64(1):45-51. doi: 10.1159/000101422. Epub 2007 Apr 27.
To provide data classes and methods to facilitate the analysis of whole genome association studies in the R language for statistical computing.
We have implemented data classes in which each genotype call is stored as a single byte. At this density, data for single chromosomes derived from large studies and new high-throughput gene chip platforms can be handled in memory. We use the object-oriented programming model introduced with version 4 of the S language, usually termed 'S4 methods'.
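The one-byte-per-call layout described above can be illustrated with a minimal sketch. It is written in Python for illustration only and is not the package's R/S4 implementation. The class name `GenotypeMatrix`, its methods, and the coding (0 = no call, 1 = AA, 2 = AB, 3 = BB) are assumptions made for this example.

```python
# Illustrative sketch only: a subjects x SNPs matrix with one byte per call.
# The coding below is an assumption for this example, not the package's
# documented format: 0 = no call, 1 = AA, 2 = AB, 3 = BB.
CODES = {None: 0, "AA": 1, "AB": 2, "BB": 3}


class GenotypeMatrix:
    """Hypothetical container storing genotype calls row-major in a bytearray."""

    def __init__(self, n_subjects, n_snps):
        self.n_subjects = n_subjects
        self.n_snps = n_snps
        # bytearray(n) is zero-filled, so every call starts as "no call".
        self.data = bytearray(n_subjects * n_snps)

    def set_call(self, subject, snp, call):
        """Store a call given as 'AA', 'AB', 'BB' or None (missing)."""
        self.data[subject * self.n_snps + snp] = CODES[call]

    def b_allele_count(self, subject, snp):
        """Return the number of B alleles (0, 1 or 2), or None if missing."""
        code = self.data[subject * self.n_snps + snp]
        return None if code == 0 else code - 1

    def nbytes(self):
        """Size of the genotype payload in bytes (one per call)."""
        return len(self.data)


g = GenotypeMatrix(n_subjects=3, n_snps=4)
g.set_call(0, 0, "AA")
g.set_call(1, 0, "AB")
g.set_call(2, 3, "BB")
```

At one byte per call, storage is simply subjects times SNPs: with hypothetical figures of 5,000 subjects and 100,000 SNPs on one chromosome, the payload is 500,000,000 bytes, roughly 500 MB.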
At its current state of development, the package supports only population-based studies, although we hope to add support for family-based studies soon. Both quantitative and qualitative phenotypes may be analysed. Flexible association testing functions are provided; these carry out single-SNP tests that control for potential confounding by quantitative and qualitative covariates. Tests involving several SNPs taken together as 'tags' are also supported. Efficient calculation of pairwise linkage disequilibrium measures is implemented, and the data input functions include one that downloads data directly from the International HapMap Project website.
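As a rough illustration of a pairwise linkage disequilibrium measure, the sketch below computes the squared Pearson correlation between allele counts (0, 1 or 2) at two SNPs, using only subjects with a call at both. This is a simple genotype-based approximation to r-squared and is not claimed to be the estimator the package uses; haplotype-based measures from unphased data generally need more work. The function name `genotype_r_squared` and the use of `None` for missing calls are assumptions made for this example.

```python
# Illustrative sketch only: genotype-correlation approximation to LD r^2.
# x and y are per-subject allele counts (0, 1, 2) at two SNPs, with None
# marking a missing call. This is NOT presented as the package's method.
def genotype_r_squared(x, y):
    """Squared Pearson correlation over subjects called at both SNPs."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    if n < 2:
        return None  # too few complete pairs to estimate anything
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    sxx = sum((a - mean_x) ** 2 for a, _ in pairs)
    syy = sum((b - mean_y) ** 2 for _, b in pairs)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    if sxx == 0 or syy == 0:
        return None  # a monomorphic SNP has no defined correlation
    return (sxy * sxy) / (sxx * syy)


# Two SNPs in perfect (inverse) correlation; the fourth subject is dropped
# because the first SNP has no call there.
snp_a = [0, 1, 2, None]
snp_b = [2, 1, 0, 1]
ld = genotype_r_squared(snp_a, snp_b)
```

Returning `None` for monomorphic SNPs or too few complete pairs, rather than zero, keeps "undefined" distinct from "no correlation" when many SNP pairs are scanned.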