Liu W-m, Mei R, Di X, Ryder T B, Hubbell E, Dee S, Webster T A, Harrington C A, Ho M-h, Baid J, Smeekens S P
Applied Research and Product Development, Affymetrix, Inc, 3380 Central Expressway, Santa Clara, CA 95051, USA.
Bioinformatics. 2002 Dec;18(12):1593-9. doi: 10.1093/bioinformatics/18.12.1593.
We consider the detection of expressed genes and their comparison across experiments using high-density oligonucleotide microarrays. The results are summarized as detection calls and comparison calls, which should be robust against data outliers over a wide range of target concentrations. It is also helpful to provide user-adjustable parameters for balancing specificity and sensitivity under various experimental conditions.
We present rank-based algorithms for making detection and comparison calls on expression microarrays. The detection call algorithm utilizes the discrimination scores. The comparison call algorithm utilizes intensity differences. Both algorithms are based on Wilcoxon's signed-rank test. Several parameters in the algorithms can be adjusted by the user to alter levels of specificity and sensitivity. The algorithms were developed and analyzed using spiked-in genes arrayed in a Latin square format. In the call process, p-values are calculated to give a confidence level for the pertinent hypotheses. For comparison calls made between two arrays, two primary normalization factors are defined. To overcome the difficulty that constant normalization factors do not fit all probe sets, we perturb these primary normalization factors and make increasing or decreasing calls only if all resulting p-values fall within a defined critical region. Our algorithms also automatically handle scanner saturation.
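The two call procedures described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The discrimination score is assumed to be (PM - MM) / (PM + MM) for each perfect-match/mismatch probe pair. The threshold `tau`, the cutoffs `alpha1`, `alpha2` and `gamma`, and the `perturbation` value are illustrative assumptions that the abstract does not specify. The comparison sketch uses a single normalization factor rather than the two primary factors mentioned above, and scanner saturation is not handled.

```python
from typing import List, Sequence, Tuple


def signed_rank_p_value(values: Sequence[float]) -> float:
    """Exact one-sided Wilcoxon signed-rank p-value for H1: median > 0."""
    nonzero = [v for v in values if v != 0]  # zero differences are dropped
    n = len(nonzero)
    if n == 0:
        return 1.0  # assumption: no evidence at all when every difference is zero
    order = sorted(range(n), key=lambda i: abs(nonzero[i]))
    doubled_ranks = [0] * n  # ranks are doubled so that tied (average) ranks stay integers
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(nonzero[order[j + 1]]) == abs(nonzero[order[i]]):
            j += 1
        for k in range(i, j + 1):
            doubled_ranks[order[k]] = i + j + 2  # 2 * average of ranks i+1 .. j+1
        i = j + 1
    observed = sum(r for r, v in zip(doubled_ranks, nonzero) if v > 0)
    # counts[s] = number of sign assignments whose doubled positive-rank sum equals s
    counts = [0] * (sum(doubled_ranks) + 1)
    counts[0] = 1
    for r in doubled_ranks:
        for s in range(len(counts) - 1, r - 1, -1):
            counts[s] += counts[s - r]
    return sum(counts[observed:]) / 2 ** n


def detection_call(pm: Sequence[float], mm: Sequence[float], tau: float = 0.015,
                   alpha1: float = 0.04, alpha2: float = 0.06) -> Tuple[str, float]:
    """Return ("P" | "M" | "A", p-value) for one probe set (parameter values assumed)."""
    scores = [(p - m) / (p + m) for p, m in zip(pm, mm) if p + m > 0]
    p_value = signed_rank_p_value([r - tau for r in scores])
    if p_value < alpha1:
        return "P", p_value  # present
    if p_value < alpha2:
        return "M", p_value  # marginal
    return "A", p_value      # absent


def comparison_call(experiment: Sequence[float], baseline: Sequence[float],
                    norm_factor: float = 1.0, perturbation: float = 1.1,
                    gamma: float = 0.0025) -> str:
    """Return "I", "D" or "NC"; a call requires every perturbed p-value below gamma."""
    factors = [norm_factor / perturbation, norm_factor, norm_factor * perturbation]
    up: List[float] = []
    down: List[float] = []
    for f in factors:
        diffs = [e - f * b for e, b in zip(experiment, baseline)]
        up.append(signed_rank_p_value(diffs))
        down.append(signed_rank_p_value([-d for d in diffs]))
    if all(p < gamma for p in up):
        return "I"   # increase
    if all(p < gamma for p in down):
        return "D"   # decrease
    return "NC"      # no change


if __name__ == "__main__":
    pm = [1000 + 10 * i for i in range(16)]
    mm = [200] * 16
    print(detection_call(pm, mm))
    print(comparison_call([2 * x for x in pm], pm))
```

Requiring all perturbed p-values to fall in the critical region makes the comparison call conservative: a probe set whose apparent change depends on the exact normalization factor is reported as "NC".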