Huang Huei-Chung, Qin Li-Xuan
Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY, USA.
PeerJ. 2018 Apr 11;6:e4584. doi: 10.7717/peerj.4584. eCollection 2018.
Data artifacts due to variations in experimental handling are ubiquitous in microarray studies, and they can lead to biased and irreproducible findings. A popular approach to correcting such artifacts is post hoc data adjustment, such as data normalization. Statistical methods for data normalization have been developed and evaluated primarily for the discovery of individual molecular biomarkers. Their performance has rarely been studied for the development of multi-marker molecular classifiers, an increasingly important application of microarrays in the era of personalized medicine.
In this study, we evaluated the performance of three commonly used methods for data normalization in the context of molecular classification, using extensive simulations based on re-sampling from a unique pair of microRNA microarray datasets for the same set of samples. The data and code for our simulations are freely available as R packages on GitHub.
In the presence of confounding handling effects, all three normalization methods tended to improve the accuracy of the classifier when evaluated on independent test data. The level of improvement and the relative performance among the normalization methods depended on the relative level of molecular signal, the distributional pattern of handling effects (e.g., location shift vs scale change), and the statistical method used for building the classifier. In addition, cross-validation yielded over-optimistic estimates of classification accuracy for all three normalization methods.
Normalization may improve the accuracy of molecular classification for data with confounding handling effects; however, it cannot circumvent the over-optimistic findings associated with cross-validation for assessing classification accuracy.
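The distinction drawn above between location-shift and scale-change handling effects can be illustrated with a small sketch. The abstract does not name the three normalization methods studied, so the example assumes median normalization purely for illustration. The data are simulated here and are not the study's datasets, and the function name `median_normalize` is hypothetical.

```python
import numpy as np

def median_normalize(x):
    """Shift each array (column) so that all arrays share a common median.

    Illustrative assumption: this is a generic median normalization,
    not necessarily one of the three methods evaluated in the study.
    """
    col_medians = np.median(x, axis=0)
    return x - col_medians + np.median(col_medians)

# Simulated log2 intensities: 200 markers (rows) x 6 arrays (columns).
rng = np.random.default_rng(0)
signal = rng.normal(loc=8.0, scale=1.0, size=(200, 6))

# Handling effect 1: an additive, array-specific location shift.
shift = np.array([0.0, 0.5, -0.3, 1.0, -0.8, 0.2])
shifted = signal + shift

# Handling effect 2: a multiplicative, array-specific scale change around 8.0.
scale = np.array([1.0, 1.5, 0.7, 1.0, 1.3, 0.9])
scaled = (signal - 8.0) * scale + 8.0

norm_shifted = median_normalize(shifted)
norm_scaled = median_normalize(scaled)

# Location shifts are removed: all array medians now coincide.
medians = np.median(norm_shifted, axis=0)
median_spread = medians.max() - medians.min()

# Scale changes are not: the per-array spread still differs after normalization.
sds = norm_scaled.std(axis=0)
sd_ratio = sds.max() / sds.min()

print(f"spread of array medians after normalization: {median_spread:.2e}")
print(f"largest/smallest array SD after normalization: {sd_ratio:.2f}")
```

In this sketch the array medians coincide after normalization, while the ratio of the largest to the smallest per-array standard deviation stays well above 1. That is one simple way to see why the benefit of a normalization method can depend on the distributional pattern of the handling effects.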