Hu Xue, Yu Zebo
Department of Blood Transfusion, The First Affiliated Hospital of Chongqing Medical University, Chongqing 400016, P.R. China.
Oncol Lett. 2019 Feb;17(2):1483-1490. doi: 10.3892/ol.2018.9761. Epub 2018 Nov 26.
Malignant mesothelioma (MM) is a rare but aggressive cancer. A definitive diagnosis of MM is critical for effective treatment and carries important medicolegal significance, but it is difficult to reach because of the tumor's composite epithelial/mesenchymal pattern. The aim of the current study was to develop a deep learning method to diagnose MM automatically. A retrospective analysis of 324 participants with or without MM was performed. Significant features were selected using a genetic algorithm (GA) or a ReliefF algorithm implemented in MATLAB. Several models were then constructed and trained to diagnose MM, based on a backpropagation (BP) algorithm, an extreme learning machine algorithm and a stacked sparse autoencoder (SSAE). A confusion matrix, the F-measure and a receiver operating characteristic (ROC) curve were used to evaluate the performance of each model. Of the 34 potential variables analyzed, the GA and ReliefF algorithms selected 19 and 5 effective features, respectively, and the selected features were used as the inputs to the three models. SSAE and GA+SSAE demonstrated the highest performance in terms of classification accuracy, specificity, F-measure and area under the ROC curve. Of these two, the GA+SSAE model was preferred because it required a shorter CPU time and fewer variables. The SSAE with GA feature selection was therefore chosen as the most accurate model for the diagnosis of MM. Deep learning methods based on the GA+SSAE model may assist physicians in diagnosing MM.
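The evaluation measures named in the abstract (confusion matrix, accuracy, specificity, F-measure and area under the ROC curve) can be illustrated with a minimal Python sketch. This is not the authors' MATLAB code. The function names, the 0.5 decision threshold and the eight toy labels and scores are assumptions made up for illustration and do not come from the study's 324 participants.

```python
# Illustrative sketch only: toy data, hypothetical function names.

def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels, with 1 = MM and 0 = no MM."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def summary_metrics(y_true, y_pred):
    """Accuracy, sensitivity, specificity and F-measure from the confusion matrix."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    total = tp + fp + tn + fn
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # recall on MM cases
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # recall on non-MM cases
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + sensitivity
    f_measure = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f_measure": f_measure,
    }


def roc_auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison of positive and
    negative scores (ties count as half a win)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example: true labels and classifier scores for eight hypothetical cases.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

print(confusion_counts(y_true, y_pred))
print(summary_metrics(y_true, y_pred))
print(roc_auc(y_true, scores))
```

The pairwise AUC computation is quadratic in the number of cases, which is acceptable for a cohort of a few hundred participants; a rank-based formula would give the same value more efficiently on larger data.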