Carvalho Paulo Costa, Carvalho Maria da Gloria Costa, Degrave Wim, Lilla Sergio, De Nucci Gilberto, Fonseca Raul, Spector Nelson, Musacchio Juliane, Domont Gilberto Barbosa
Laboratory for Functional Genomics and Bioinformatics, Oswaldo Cruz Institute, Rio de Janeiro, Brazil.
J Exp Ther Oncol. 2007;6(2):137-45.
More than 90% of patients with cancer can be treated promptly if diagnosed early; however, diagnosis usually occurs after the cancer cells have metastasized. Recent technological advances in mass spectrometry challenge the field of machine learning to model the resulting high-dimensional datasets for clinical diagnosis and prognosis. Here we use support vector machine recursive feature elimination (SVM-RFE) to search for protein expression patterns in the serum mass spectra of Hodgkin's disease (HD) patients and control subjects (CS) that could aid in diagnosing the disease. Based on eight selected features, a support vector machine (SVM) correctly classified all CS and HD patients under leave-one-out cross-validation. We also correctly classified an independent dataset, acquired from the same samples, with the previously generated SVM model.
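The workflow described in the abstract (SVM-RFE down to eight features, scored by leave-one-out) can be illustrated with a minimal sketch. This is not the authors' code: it assumes scikit-learn and NumPy, uses synthetic data in place of the serum spectra, and the function names (`make_synthetic_spectra`, `build_svm_rfe`, `leave_one_out_accuracy`), the value `C=1.0`, the standardization step, and the choice to rerun feature selection inside every fold are all illustrative assumptions that the abstract does not confirm.

```python
import numpy as np
from sklearn.feature_selection import RFE
from sklearn.model_selection import LeaveOneOut, cross_val_predict
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def make_synthetic_spectra(n_per_class=20, n_features=60, n_informative=8, seed=0):
    """Hypothetical stand-in for binned serum spectra (rows: samples, columns: m/z bins)."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(2 * n_per_class, n_features))
    y = np.repeat([0, 1], n_per_class)  # 0 = control subject (CS), 1 = HD patient
    X[y == 1, :n_informative] += 3.0    # shift a few bins in the patient class
    return X, y


def build_svm_rfe(n_features_to_select=8):
    """Scale, rank features with a linear SVM via RFE, then classify on the survivors."""
    # A linear kernel is needed so RFE can rank features by the SVM weights (coef_).
    return make_pipeline(
        StandardScaler(),
        RFE(SVC(kernel="linear", C=1.0),
            n_features_to_select=n_features_to_select, step=1),
        SVC(kernel="linear", C=1.0),
    )


def leave_one_out_accuracy(X, y, n_features_to_select=8):
    """Fraction of samples classified correctly when each one is held out in turn."""
    # Assumption: selection is refit in every fold, so the held-out sample
    # never influences which features are kept.
    model = build_svm_rfe(n_features_to_select)
    predictions = cross_val_predict(model, X, y, cv=LeaveOneOut())
    return float(np.mean(predictions == y))


X, y = make_synthetic_spectra()
loo_accuracy = leave_one_out_accuracy(X, y)

# Fit once on all samples to obtain a reusable model and the selected feature indices.
final_model = build_svm_rfe().fit(X, y)
selected_bins = np.flatnonzero(final_model.named_steps["rfe"].support_)
print(f"leave-one-out accuracy: {loo_accuracy:.2f}")
print("selected feature indices:", selected_bins)
```

A model fitted this way could then be applied to an independently acquired dataset with `final_model.predict(X_new)`, where `X_new` is a hypothetical array with the same feature layout as the training data.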