Xue Zhiyun, Liang Zhaohui, Rajaraman Sivaramakrishnan, Marini Niccolo, Antani Sameer
Division of Intramural Research, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA.
IEEE Int Conf Omnilayer Intell Syst. 2025 Aug;2025. doi: 10.1109/coins65080.2025.11125786. Epub 2025 Aug 22.
Oral cancer has one of the lowest five-year survival rates among major cancer types. Early detection, followed by histopathological confirmation, is therefore crucial. State-of-the-art methods reported in the literature predict oral cancer largely from images alone. The use of deep learning networks on the related tabular medical data remains unexplored for oral cancer and understudied in general. As part of our multimodal AI/ML approach toward reliable prediction of candidate lesions to biopsy, we describe our work applying deep learning to fielded, structured clinical text data in spreadsheet format (tabular data). The data are a subset of 1791 patients drawn from a large ongoing oral cancer study, and the task is to distinguish patients with a cancerous lesion from those with a precancerous lesion (i.e., a direct precursor to cancer). We compare two tabular deep learning methods and one conventional algorithm for this predictive analysis. On a hold-out test set, all models show promising performance (Youden index > 0.6 and AUC > 0.9). In addition, we examine the interpretability of the models. All models indicate that lesion characteristics are crucial predictive features. The insights and results from this work should be valuable to the research community applying AI/ML to biomedicine.
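For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how a Youden index (sensitivity + specificity - 1) and an AUC can be computed for a binary cancer-versus-precancer classifier. The labels, scores, and the 0.5 decision threshold are made-up illustrative values, not data or settings from the study, and the function names are hypothetical; the abstract does not state how the operating threshold was chosen.

```python
def youden_index(y_true, y_pred):
    """Youden's J = sensitivity + specificity - 1 for binary labels (1 = cancer)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity + specificity - 1.0


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win (the Mann-Whitney U formulation).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy example (illustrative values only, not study data).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
threshold = 0.5  # assumed operating point for illustration
y_pred = [1 if s >= threshold else 0 for s in scores]

j = youden_index(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(f"Youden index = {j:.4f}, AUC = {auc:.4f}")
```

The Youden index depends on the chosen threshold, whereas the AUC summarizes ranking quality across all thresholds, which is why the two are usually reported together.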