Borji Arezoo, Kronreif Gernot, Angermayr Bernhard, Hatamikia Sepideh
Austrian Center for Medical Innovation and Technology, Wiener Neustadt, Austria.
Research Center for Clinical AI-Research in Omics and Medical Data Science (CAROM), Department of Medicine, Danube Private University (DPU), Krems an der Donau, Austria.
Front Med (Lausanne). 2025 Apr 16;12:1555907. doi: 10.3389/fmed.2025.1555907. eCollection 2025.
Recent advances in machine learning are transforming medical image analysis, particularly in cancer detection and classification. Deep learning techniques, especially convolutional neural networks (CNNs) and vision transformers (ViTs), now enable precise analysis of complex histopathological images, automating detection and improving classification accuracy across cancer types. This study focuses on osteosarcoma (OS), the most common bone cancer in children and adolescents, which typically affects the long bones of the arms and legs. Early and accurate detection of OS is essential for improving patient outcomes and reducing mortality. However, the increasing prevalence of cancer and the demand for personalized treatments make precise diagnosis and customized therapy challenging.
We propose a novel hybrid model that combines a CNN and a ViT to improve diagnostic accuracy for OS using hematoxylin and eosin (H&E)-stained histopathological images. The CNN extracts local features, while the ViT captures global patterns from the histopathological images. These features are combined and classified by a multi-layer perceptron (MLP) into four categories: non-tumor (NT), non-viable tumor (NVT), viable tumor (VT), and non-viable ratio (NVR).
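The fusion step described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the feature dimensions (512 and 768), the hidden width (256), the random weights, and the names `init_mlp` and `fuse_and_classify` are all assumptions chosen for illustration, and random vectors stand in for the outputs of a real CNN and ViT backbone. The sketch only shows the general pattern of concatenating two feature vectors and passing them through an MLP with a four-way softmax.

```python
import numpy as np

# Class labels as named in the abstract.
CLASSES = ["NT", "NVT", "VT", "NVR"]


def relu(x):
    return np.maximum(x, 0.0)


def softmax(z):
    # Subtract the row maximum for numerical stability.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def init_mlp(in_dim, hidden_dim, n_classes, rng):
    # Hypothetical one-hidden-layer MLP; the sizes are illustrative only.
    return {
        "W1": rng.normal(0.0, 0.02, size=(in_dim, hidden_dim)),
        "b1": np.zeros(hidden_dim),
        "W2": rng.normal(0.0, 0.02, size=(hidden_dim, n_classes)),
        "b2": np.zeros(n_classes),
    }


def fuse_and_classify(cnn_feats, vit_feats, params):
    # Late fusion: concatenate local (CNN) and global (ViT) features per image.
    fused = np.concatenate([cnn_feats, vit_feats], axis=1)
    hidden = relu(fused @ params["W1"] + params["b1"])
    return softmax(hidden @ params["W2"] + params["b2"])


rng = np.random.default_rng(0)
# Placeholder features for a batch of 8 images (assumed dimensions).
cnn_feats = rng.normal(size=(8, 512))
vit_feats = rng.normal(size=(8, 768))

params = init_mlp(512 + 768, 256, len(CLASSES), rng)
probs = fuse_and_classify(cnn_feats, vit_feats, params)
pred = [CLASSES[i] for i in probs.argmax(axis=1)]
```

In a trained system the weights would be learned from labeled images; here they are random, so the predicted labels carry no meaning beyond demonstrating the data flow.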
Using the dataset from The Cancer Imaging Archive (TCIA), the model achieved an accuracy of 99.08%, precision of 99.10%, recall of 99.28%, and an F1-score of 99.23%. This is the first successful four-class classification using this dataset, setting a new benchmark in OS research and showing promise for future diagnostic advances.
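For readers unfamiliar with how the four reported metrics are computed in a multi-class setting, the following pure-Python sketch derives them from a confusion matrix. The abstract does not state which averaging scheme was used, so macro averaging (the unweighted mean over classes) is an assumption here, and the toy labels are invented solely to exercise the functions; they are not data from the study.

```python
def confusion_matrix(y_true, y_pred, labels):
    # Rows are true classes, columns are predicted classes.
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def macro_metrics(matrix):
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(n))
    precisions, recalls, f1s = [], [], []
    for i in range(n):
        tp = matrix[i][i]
        predicted = sum(matrix[r][i] for r in range(n))  # column sum
        actual = sum(matrix[i])                          # row sum
        p = tp / predicted if predicted else 0.0
        r = tp / actual if actual else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)
    return {
        "accuracy": correct / total,
        "precision": sum(precisions) / n,  # macro average (assumed)
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


labels = ["NT", "NVT", "VT", "NVR"]
# Toy labels for illustration only: one NVT sample is misclassified as VT.
y_true = ["NT", "NT", "NVT", "NVT", "VT", "VT", "NVR", "NVR"]
y_pred = ["NT", "NT", "NVT", "VT", "VT", "VT", "NVR", "NVR"]

metrics = macro_metrics(confusion_matrix(y_true, y_pred, labels))
```

With a different averaging scheme (for example, weighting each class by its support), precision, recall, and F1 would generally differ from the macro values, while accuracy would stay the same.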