Bhowmick Debjani, Davison A C, Goldstein Darlene R, Ruffieux Yann
Ecole Polytechnique Fédérale de Lausanne, Institute of Mathematics, EPFL-FSB-IMA, Station 8, Lausanne, Switzerland.
Biostatistics. 2006 Oct;7(4):630-41. doi: 10.1093/biostatistics/kxj032. Epub 2006 Mar 24.
Microarrays have become an important tool for studying the molecular basis of complex disease traits and fundamental biological processes. A common purpose of microarray experiments is the detection of genes that are differentially expressed under two conditions, such as treatment versus control or wild type versus knockout. We introduce a Laplace mixture model as a long-tailed alternative to the normal distribution when identifying differentially expressed genes in microarray experiments, and provide an extension to asymmetric over- or underexpression. This model permits greater flexibility than models in current use, as it has the potential, at least with sufficient data, to accommodate both whole-genome and restricted-coverage arrays. We also propose likelihood approaches to hyperparameter estimation, which are equally applicable in the normal mixture case. The Laplace model appears to give some improvement in fit to data, though simulation studies show that our method performs similarly to several other statistical approaches to the problem of identifying differential expression.
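To make the idea of a long-tailed mixture concrete, the following is a minimal sketch and not the model specified in the paper. It assumes a simplified two-component mixture on a per-gene score x (for example, an average log-ratio): a zero-mean normal component for non-differentially expressed genes and a zero-centred Laplace component for differentially expressed genes. The function names and the values of `weight`, `null_sd`, and `alt_scale` are hypothetical and chosen only for illustration.

```python
import math


def normal_pdf(x, sd):
    """Density of a zero-mean normal distribution with standard deviation sd."""
    return math.exp(-0.5 * (x / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def laplace_pdf(x, scale):
    """Density of a zero-centred Laplace distribution with the given scale."""
    return math.exp(-abs(x) / scale) / (2.0 * scale)


def posterior_de(x, weight=0.1, null_sd=0.3, alt_scale=1.0):
    """Posterior probability that score x belongs to the Laplace component.

    weight    : assumed prior proportion of differentially expressed genes
    null_sd   : assumed standard deviation of the normal (null) component
    alt_scale : assumed scale of the Laplace (alternative) component
    All defaults are illustrative assumptions, not estimates from data.
    """
    alt = weight * laplace_pdf(x, alt_scale)
    null = (1.0 - weight) * normal_pdf(x, null_sd)
    return alt / (alt + null)


if __name__ == "__main__":
    # Scores far from zero are attributed to the long-tailed component.
    for x in (0.0, 0.5, 1.0, 2.0):
        print(f"x = {x:4.1f}  P(DE | x) = {posterior_de(x):.3f}")
```

In this sketch the hyperparameters are fixed by hand; the abstract indicates that the authors instead estimate them by likelihood methods, which is not reproduced here.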