Ott Jurg, Hoh Josephine
Rockefeller University, 1230 York Avenue, New York, NY 10021, USA.
J Comput Biol. 2003;10(3-4):569-74. doi: 10.1089/10665270360688192.
Common heritable diseases ("complex traits") are assumed to be due to multiple underlying susceptibility genes. While genetic mapping methods for Mendelian disorders have been very successful, the search for genes underlying complex traits has been difficult and often disappointing. One reason may be that most current gene-mapping approaches still rely on the conventional methodology of testing one or a few SNPs at a time. Here, we demonstrate a simple strategy that allows for the joint analysis of multiple disease-associated SNPs in different genomic regions. Our set-association method combines information across SNPs by forming sums of relevant single-marker statistics. As previously hypothesized, we show here that this approach successfully addresses the "curse of dimensionality" problem, in which too many variables must be estimated from a comparatively small number of observations. We also report results of simulation studies showing that our method furnishes unbiased and accurate significance levels. Power calculations demonstrate good power even in the presence of large numbers of SNPs not associated with disease. We extended our method to microarray expression data, where expression levels for large numbers of genes are to be compared between two tissue types. In applications to such data, our approach turned out to be highly efficient.
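The idea of combining SNPs by summing single-marker statistics can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not specify the single-marker statistic, how many SNPs enter the sum, or how significance is assessed, so the sketch assumes a squared standardized difference in mean allele count between cases and controls, a fixed number `k` of top-ranked SNPs, and a permutation of case/control labels to obtain the p-value. All function and parameter names are hypothetical.

```python
import numpy as np


def single_marker_stats(genotypes, is_case):
    """Per-SNP statistic (an assumed choice): squared standardized difference
    in mean allele count between cases and controls.

    genotypes: (n_individuals, n_snps) array of allele counts 0, 1 or 2.
    is_case:   boolean array of length n_individuals.
    """
    cases = genotypes[is_case]
    controls = genotypes[~is_case]
    diff = cases.mean(axis=0) - controls.mean(axis=0)
    # Pooled variance; the small constant guards against monomorphic SNPs.
    pooled_var = genotypes.var(axis=0) + 1e-12
    se2 = pooled_var * (1.0 / len(cases) + 1.0 / len(controls))
    return diff ** 2 / se2


def sum_of_top_k(stats, k):
    """Sum of the k largest single-marker statistics."""
    return np.sort(stats)[::-1][:k].sum()


def set_association_pvalue(genotypes, is_case, k, n_perm=1000, seed=0):
    """Permutation p-value for the sum of the top-k statistics (assumed scheme)."""
    rng = np.random.default_rng(seed)
    observed = sum_of_top_k(single_marker_stats(genotypes, is_case), k)
    exceed = 0
    for _ in range(n_perm):
        # Shuffling labels breaks any genotype-phenotype association
        # while preserving the correlation structure among SNPs.
        shuffled = rng.permutation(is_case)
        permuted = sum_of_top_k(single_marker_stats(genotypes, shuffled), k)
        if permuted >= observed:
            exceed += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (exceed + 1) / (n_perm + 1)
```

Because only one summed statistic is tested rather than one test per SNP, the number of estimated quantities stays small even when many SNPs are genotyped. In this sketch `k` is fixed in advance; choosing `k` from the data would require an additional layer of permutation to keep the significance level honest.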