Fang Yongxiang, Brass Andrew, Hoyle David C, Hayes Andrew, Bashein Abdulla, Oliver Stephen G, Waddington David, Rattray Magnus
School of Biological Sciences, University of Manchester, Manchester M13 9PT, UK.
Nucleic Acids Res. 2003 Aug 15;31(16):e96. doi: 10.1093/nar/gng097.
A statistical model is proposed for the analysis of errors in microarray experiments and is employed in the analysis and development of a combined normalisation regime. Through analysis of the model and two-dye microarray data sets, this study found the following. The systematic error introduced by microarray experiments mainly involves spot intensity-dependent, feature-specific and spot position-dependent contributions. It is difficult to remove all these errors effectively without a suitable combined normalisation operation. Adaptive normalisation using a suitable regression technique is more effective in removing spot intensity-related dye bias than self-normalisation, while regional normalisation (block normalisation) is an effective way to correct spot position-dependent errors. However, dye-flip replicates are necessary to remove feature-specific errors, and also allow the analyst to identify the experimentally introduced dye bias contained in non-self-self data sets. In this case, the bias present in the data sets may include both experimentally introduced dye bias and the biological difference between two samples. Self-normalisation is capable of removing dye bias without identifying the nature of that bias. The performance of adaptive normalisation, on the other hand, depends on its ability to correctly identify the dye bias. If adaptive normalisation is combined with an effective dye bias identification method then there is no systematic difference between the outcomes of the two methods.
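As a rough illustration of the three normalisation steps named in the abstract, the sketch below works on per-spot log-ratios M = log2(R/G) and average log-intensities A. It is a minimal sketch under stated assumptions, not the authors' implementation. It assumes a simple additive model in which a forward array measures (true log-ratio + dye bias) and its dye-flip replicate measures (-true log-ratio + dye bias). It uses a block median for regional normalisation, and it uses a straight-line fit of M on A as a stand-in for the "suitable regression technique" (a local regression such as loess would be the more usual choice). All function names are hypothetical.

```python
from statistics import median


def self_normalise(m_forward, m_flipped):
    """Half the difference of dye-flip log-ratios.

    Under the assumed additive model, bias common to both arrays cancels
    and the true log-ratio remains.
    """
    return [(f - r) / 2 for f, r in zip(m_forward, m_flipped)]


def dye_bias_estimate(m_forward, m_flipped):
    """Half the sum of dye-flip log-ratios.

    Under the same assumption, the true log-ratio cancels and the
    experimentally introduced dye bias remains.
    """
    return [(f + r) / 2 for f, r in zip(m_forward, m_flipped)]


def block_normalise(m, blocks):
    """Regional normalisation: subtract each block's median log-ratio."""
    by_block = {}
    for value, block in zip(m, blocks):
        by_block.setdefault(block, []).append(value)
    centre = {block: median(values) for block, values in by_block.items()}
    return [value - centre[block] for value, block in zip(m, blocks)]


def adaptive_normalise(m, a):
    """Remove an intensity-dependent trend of M on A.

    Assumption: an ordinary least-squares straight line stands in for the
    regression; it is not a loess fit.
    """
    n = len(m)
    a_mean = sum(a) / n
    m_mean = sum(m) / n
    sxx = sum((x - a_mean) ** 2 for x in a)
    sxy = sum((x - a_mean) * (y - m_mean) for x, y in zip(a, m))
    slope = sxy / sxx
    intercept = m_mean - slope * a_mean
    return [y - (intercept + slope * x) for x, y in zip(a, m)]


# Toy data (invented for illustration): true log-ratios [1.0, -1.0, 0.0, 0.5]
# and per-feature dye bias [0.25, 0.5, 0.75, 0.25].
m_forward = [1.25, -0.5, 0.75, 0.75]   # true + bias
m_flipped = [-0.75, 1.5, 0.75, -0.25]  # -true + bias

print(self_normalise(m_forward, m_flipped))     # [1.0, -1.0, 0.0, 0.5]
print(dye_bias_estimate(m_forward, m_flipped))  # [0.25, 0.5, 0.75, 0.25]
```

In this toy case the half-difference recovers the true log-ratios without ever modelling the bias, which mirrors the abstract's point that self-normalisation removes dye bias without identifying its nature. The regression-based step only works as well as its trend estimate matches the real bias.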