Somorjai R L, Dolenko B, Mandelzweig M
Institute for Biodiagnostics, National Research Council Canada, 435 Ellice Avenue, Winnipeg, Man., Canada.
J Biomed Inform. 2007 Apr;40(2):131-8. doi: 10.1016/j.jbi.2006.04.001. Epub 2006 Apr 20.
Previously, we introduced a distance (similarity)-based mapping for the visualization of high-dimensional patterns and their relative relationships. The mapping preserves exactly the original distances from all points to any two reference patterns in a special two-dimensional coordinate system, the relative distance plane (RDP). We extend the RDP mapping's applicability from visualization to classification. Several of the classifiers use the RDP directly. These include the standard linear discriminant analysis (LDA), nearest neighbor classifiers, and a transvariation probabilities-based classification method that is natural in the RDP. Several reference directions can also be combined to create new coordinate systems in which arbitrary classifiers can be developed. We obtain increased confidence in the classification results by cycling through all possible reference pairs and computing a misclassification-based weighted accuracy. The classification results on several high-dimensional biomedical datasets are compared.
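The distance-preserving property described above can be illustrated with a minimal sketch. It is not taken from the paper: it assumes Euclidean distance and a simple triangulation that places the first reference pattern at the origin and the second on the x-axis at their mutual distance D. The names `euclidean` and `rdp_map` are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences (an assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rdp_map(points, ref1, ref2, dist=euclidean):
    """Map each point to 2-D so that its distances to ref1 and ref2 are kept.

    Assumed layout: ref1 sits at (0, 0) and ref2 at (D, 0), with D = dist(ref1, ref2).
    """
    D = dist(ref1, ref2)
    if D == 0:
        raise ValueError("reference patterns must be distinct")
    coords = []
    for p in points:
        d1, d2 = dist(p, ref1), dist(p, ref2)
        # Triangulation: solve x^2 + y^2 = d1^2 and (x - D)^2 + y^2 = d2^2.
        x = (d1 ** 2 - d2 ** 2 + D ** 2) / (2 * D)
        # Clamp at zero to guard against small negative values from rounding.
        y = math.sqrt(max(d1 ** 2 - x ** 2, 0.0))
        coords.append((x, y))
    return coords, D


# Illustrative 5-dimensional patterns (made-up values, not data from the paper).
patterns = [(1.0, 0.0, 2.0, 0.0, 1.0), (0.0, 3.0, 1.0, 1.0, 0.0)]
ref_a = (0.0, 0.0, 0.0, 0.0, 0.0)
ref_b = (4.0, 0.0, 0.0, 0.0, 0.0)
plane_coords, ref_distance = rdp_map(patterns, ref_a, ref_b)
```

Under these assumptions, the planar distance from each mapped point to (0, 0) and to (D, 0) equals its original distance to the corresponding reference pattern. Distances between two non-reference points are generally not preserved, which fits the abstract's approach of cycling through all possible reference pairs.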