Lee Yoonkyung, Lee Cheol-Koo
Department of Statistics, The Ohio State University, Columbus, OH 43210, USA.
Bioinformatics. 2003 Jun 12;19(9):1132-9. doi: 10.1093/bioinformatics/btg102.
High-density DNA microarrays measure the activities of several thousand genes simultaneously, and gene expression profiles have recently been used for cancer classification. This new approach promises better therapeutic measures for cancer patients by diagnosing cancer types with improved accuracy. The Support Vector Machine (SVM) is one of the classification methods that has been successfully applied to cancer diagnosis problems. However, its optimal extension to more than two classes has not been obvious, which may limit its application to multiple tumor types. We briefly introduce the Multicategory SVM (MSVM), a recently proposed extension of the binary SVM, and apply it to multiclass cancer diagnosis problems.
The applicability of the MSVM is demonstrated on the leukemia data (Golub et al., 1999) and the childhood small round blue cell tumor data (Khan et al., 2001). The comparable classification accuracy shown in these applications, together with its flexibility, renders the MSVM a viable alternative to other classification methods.