Glasbey Chris A, Forster Thorsten, Ghazal Peter
Biomathematics and Statistics Scotland.
Stat Appl Genet Mol Biol. 2007;6:Article34. doi: 10.2202/1544-6115.1244. Epub 2007 Dec 8.
Digital images obtained by laser scanning of spotted microarrays often include saturated pixel values. These arise when the scan settings are high enough that some pixel values exceed the limit L = 65535 and are set to L instead. Failure to adjust for this censoring leads to biased estimates of gene expression levels. To impute the censored values, we propose a linear model based on the principal components of uncensored spots on the same array. The method is computationally fast, is flexible enough to adapt to the distinctive spot shapes and profiles of different arrays, and is shown to be more effective than the polynomial-hyperbolic model in correcting for the bias. An application to biological data demonstrates its potential for enhancing the dynamic range of detection. Fortran90 subroutines implementing these methods are available at http://www.bioss.ac.uk/~chris.
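The idea of imputing censored pixels from the principal components of uncensored spots can be illustrated with a minimal Python/NumPy sketch. This is not the authors' Fortran90 code, and the details are assumptions made for illustration: each spot is flattened to a vector of pixel values, the components are taken from an uncentred SVD of the fully uncensored spots, the coefficients are fitted by ordinary least squares to the uncensored pixels of a saturated spot, and imputed values are bounded below by L. The names `fit_components` and `impute_spot` and the parameter `n_components` are hypothetical.

```python
import numpy as np

L = 65535  # saturation limit stated in the abstract


def fit_components(spots, n_components=1):
    """Return a (pixels x n_components) basis of typical spot profiles.

    Assumption: only spots with no saturated pixel are used, and the
    basis is the leading right singular vectors of the uncentred data.
    """
    uncensored = spots[(spots < L).all(axis=1)]
    _, _, vt = np.linalg.svd(uncensored, full_matrices=False)
    return vt[:n_components].T


def impute_spot(spot, basis):
    """Replace saturated pixels of one spot by linear-model predictions."""
    spot = np.asarray(spot, dtype=float)
    censored = spot >= L
    if not censored.any():
        return spot.copy()
    # Fit the component coefficients on the uncensored pixels only.
    coef, *_ = np.linalg.lstsq(basis[~censored], spot[~censored], rcond=None)
    out = spot.copy()
    # A censored pixel is known to be at least L, so bound the prediction.
    out[censored] = np.maximum(basis[censored] @ coef, L)
    return out


# Synthetic, noise-free illustration: every spot is a scaled copy of one profile.
x = np.linspace(-2.0, 2.0, 9)
profile = np.exp(-x**2 / 2.0)
training = np.outer(np.linspace(1000.0, 50000.0, 20), profile)  # all below L
true_spot = 120000.0 * profile            # peak exceeds L
observed = np.minimum(true_spot, L)       # what the scanner would record

basis = fit_components(np.vstack([training, observed]))
imputed = impute_spot(observed, basis)
```

In this noise-free toy case a single component reproduces the saturated spot; real arrays would need more components and a check that enough uncensored pixels remain in each spot to estimate the coefficients.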