Matheny Michael E, Resnic Frederic S, Arora Nipun, Ohno-Machado Lucila
Decision Systems Group, Brigham & Women's Hospital, 75 Francis Street, Boston, MA 02115, USA.
J Biomed Inform. 2007 Dec;40(6):688-97. doi: 10.1016/j.jbi.2007.05.008. Epub 2007 May 18.
Support vector machines (SVMs) have become popular among machine learning researchers, but their applications in biomedicine have been somewhat limited. A number of methods, such as grid search and evolutionary algorithms, have been used to optimize SVM model parameters. The sensitivity of the results to changes in optimization methods has not been investigated in the context of medical applications. In this study, radial-basis kernel SVM and polynomial kernel SVM mortality prediction models for percutaneous coronary interventions were optimized using (a) mean-squared error, (b) mean cross-entropy error, (c) the area under the receiver operating characteristic curve, and (d) the Hosmer-Lemeshow goodness-of-fit test (HL χ²). A nested threefold cross-validation method, with inner and outer loops, was used to select the best models using the training data, and evaluations were based on previously unseen test data. The results were compared to those produced by logistic regression models optimized using the same indices. The choice of optimization parameters had a significant impact on performance in both SVM kernel types.
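The four optimization indices and the nested threefold cross-validation described above can be sketched in plain Python. This is only an illustrative sketch, not the authors' implementation: the function names, the default of ten risk groups for the Hosmer-Lemeshow statistic, and the interleaved fold assignment are all assumptions made here for the example.

```python
import math

def mean_squared_error(y, p):
    """Mean squared difference between outcomes (0/1) and predicted risks."""
    return sum((yi - pi) ** 2 for yi, pi in zip(y, p)) / len(y)

def mean_cross_entropy(y, p, eps=1e-12):
    """Mean negative log-likelihood; predictions are clipped to avoid log(0)."""
    total = 0.0
    for yi, pi in zip(y, p):
        pi = min(max(pi, eps), 1 - eps)
        total -= yi * math.log(pi) + (1 - yi) * math.log(1 - pi)
    return total / len(y)

def auc(y, p):
    """Area under the ROC curve as the fraction of (positive, negative)
    pairs ranked correctly, counting ties as half."""
    pos = [pi for yi, pi in zip(y, p) if yi == 1]
    neg = [pi for yi, pi in zip(y, p) if yi == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in pos for b in neg)
    return wins / (len(pos) * len(neg))

def hosmer_lemeshow_chi2(y, p, groups=10):
    """Hosmer-Lemeshow chi-square over equal-sized groups of sorted risk.
    The default of 10 groups is an assumption of this sketch."""
    pairs = sorted(zip(p, y))
    n = len(pairs)
    chi2 = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        m = len(chunk)
        expected = sum(pi for pi, _ in chunk)
        observed = sum(yi for _, yi in chunk)
        mean_p = expected / m
        denom = m * mean_p * (1 - mean_p)
        if denom > 0:
            chi2 += (observed - expected) ** 2 / denom
    return chi2

def k_folds(indices, k=3):
    """Split indices into k interleaved folds; yield (train, held_out)."""
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, folds[i]

def nested_folds(n, k=3):
    """Outer loop holds out test data; inner loop splits the remaining
    training data for model selection."""
    for outer_train, test in k_folds(list(range(n)), k):
        yield outer_train, test, list(k_folds(outer_train, k))
```

In such a setup, each candidate parameter setting would be scored on the inner held-out folds with one of the four indices (lower is better for all but the AUC), and the selected model would then be evaluated once on the outer test fold.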