De Bruyne Veronique, Al-Mulla Fahd, Pot Bruno
Applied-Maths BVBA, Sint-Martens-Latem.
Methods Mol Biol. 2007;382:373-91. doi: 10.1007/978-1-59745-304-2_23.
This chapter outlines a typical workflow for microarray data analysis. It explains the background of the methods, which is necessary for choosing a specific numerical method and for understanding and interpreting the results of the analyses. We focus on error handling; preprocessing steps (clipping, imputing missing values, normalization, and data transformation); statistical tests for variable selection and the use of multiple hypothesis testing procedures; metrics and clustering algorithms for hierarchical clustering; the principles and results of principal components analysis and discriminant analysis; partitioning; self-organizing maps; the K-nearest neighbor classifier; and the use of neural networks and support vector machines for classification.
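As a rough illustration of the preprocessing and multiple-testing steps named in the abstract, the following is a minimal sketch in Python with NumPy. It is not taken from the chapter: the function names `preprocess` and `benjamini_hochberg`, the clipping floor of 1.0, row-mean imputation, the log2 transform, per-array median centering, and the choice of the Benjamini-Hochberg procedure are all assumptions made for the example, and the chapter may recommend different choices.

```python
import numpy as np

def preprocess(x, floor=1.0, ceiling=None):
    """Clip, impute, log-transform, and median-center a genes x arrays matrix.

    Hypothetical helper; each step is one assumed choice among many.
    """
    x = np.asarray(x, dtype=float).copy()
    # Clipping: bound intensities so the log transform is defined (NaN stays NaN).
    x = np.clip(x, floor, ceiling)
    # Imputation: replace each missing value with the mean of its gene (row).
    # A row that is entirely missing stays NaN.
    row_means = np.nanmean(x, axis=1, keepdims=True)
    x = np.where(np.isnan(x), row_means, x)
    # Transformation: log2 of the intensities.
    x = np.log2(x)
    # Normalization: subtract the median of each array (column).
    return x - np.median(x, axis=0, keepdims=True)

def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    p = np.asarray(pvalues, dtype=float)
    m = p.size
    order = np.argsort(p)
    # Scale the i-th smallest p-value by m / i.
    scaled = p[order] * m / np.arange(1, m + 1)
    # Enforce monotonicity, working down from the largest p-value.
    adjusted = np.minimum.accumulate(scaled[::-1])[::-1]
    out = np.empty(m)
    out[order] = np.clip(adjusted, 0.0, 1.0)
    return out
```

In such a sketch, `preprocess` would run once on the raw intensity matrix, a per-gene test would produce one p-value per gene, and `benjamini_hochberg` would adjust those p-values before genes are selected for clustering or classification.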