James Andrew C, Veitch Jim G, Zareh Ali R, Triche Timothy
Research School of Biological Sciences, Australian National University, PO Box 475, Canberra ACT 2601, Australia.
Bioinformatics. 2004 May 1;20(7):1060-5. doi: 10.1093/bioinformatics/bth038. Epub 2004 Feb 5.
A number of algorithms have been proposed for processing feature-level data from high-density oligonucleotide microarrays to give estimates of transcript abundance. Performance in the common task of detecting differential expression between samples can be quantified by the statistical concepts of sensitivity and specificity, and represented by receiver operating characteristic curves. Such curves have previously been presented for small numbers of genes known to be differentially present in spiked-in samples. We present here a study of performance over a large number (thousands) of transcripts for which there is strong evidence of differential expression, with corresponding false positive rates controlled by comparisons between replicates.
Straight-line regression analysis of a mixture series with replicates, carried out with five estimation algorithms, produces a consensus set of 4462 transcripts whose differential expression is of agreed direction and high significance (p < 0.01) according to all algorithms. The more difficult task of two-sample tests between adjacent mixture levels produces performance curves of the fraction of true positives detected against significance level. Performance differs substantially between algorithms: at the p < 0.01 level, the detection rate ranges from 41% to 66%. A control using comparisons between replicates at the same levels indicates that the tests produce empirical false positive rates closely matching the nominal p-values.
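The performance curves and the replicate control described in the abstract both reduce to the same calculation: the fraction of p-values falling below each significance level. The following is a minimal sketch of that calculation in Python, not the authors' implementation. The function name `fraction_below` and all p-values are hypothetical, invented purely for illustration; in practice the p-values would come from two-sample tests on the consensus transcripts (for the detection rate) and from comparisons between replicates (for the empirical false positive rate).

```python
from bisect import bisect_left


def fraction_below(p_values, levels):
    """Return, for each significance level, the fraction of p-values strictly below it."""
    ordered = sorted(p_values)
    n = len(ordered)
    if n == 0:
        raise ValueError("need at least one p-value")
    # bisect_left gives the count of sorted values strictly less than the level.
    return [bisect_left(ordered, level) / n for level in levels]


# Hypothetical p-values (assumed data, not taken from the study).
# Tests on transcripts believed to be differentially expressed:
consensus_p = [0.001, 0.004, 0.02, 0.008, 0.3, 0.0005, 0.06, 0.009]
# Tests between replicates, where no true difference is expected:
replicate_p = [0.5, 0.12, 0.9, 0.005, 0.33, 0.71, 0.48, 0.26, 0.84, 0.62]

levels = [0.01, 0.05]

# Fraction of true positives detected at each level.
detection_rate = fraction_below(consensus_p, levels)
# Empirical false positive rate, to be compared with the nominal levels.
false_positive_rate = fraction_below(replicate_p, levels)

print(detection_rate)
print(false_positive_rate)
```

Plotting `detection_rate` against `levels` for each algorithm would give one performance curve per algorithm. A well-calibrated test would show `false_positive_rate` close to the nominal levels, which a sample as small as this toy one cannot demonstrate.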