Cheung Eva Y W, Wu Ricky W K, Li Albert S M, Chu Ellie S M
School of Medical and Health Sciences, Tung Wah College, 31 Wylie Road, HoManTin, Hong Kong.
Department of Biological and Biomedical Sciences, School of Health and Life Sciences, Glasgow Caledonian University, Glasgow G4 0BA, UK.
Cancers (Basel). 2023 Oct 19;15(20):5063. doi: 10.3390/cancers15205063.
Glioblastoma (GBM) is one of the most common malignant primary brain tumors, accounting for 60-70% of all gliomas. Conventional diagnosis and post-operative treatment planning for glioblastoma are mainly based on feature-based qualitative analysis of hematoxylin and eosin-stained (H&E) histopathological slides by both an experienced medical technologist and a pathologist. The recent development of digital whole slide scanners makes AI-based histopathological image analysis feasible and can support cancer diagnosis through accurate cell-type counting and/or quantitative analysis. However, the technology available for digital slide image analysis is still very limited. This study aimed to build an image feature-based computer model using histopathology whole slide images to differentiate patients with GBM from healthy controls (HC).
Two independent cohorts of patients were used. The first cohort comprised 262 GBM patients from The Cancer Genome Atlas Glioblastoma Multiforme Collection (TCGA-GBM) dataset in The Cancer Imaging Archive (TCIA) database. The second cohort comprised 60 GBM patients recruited from a local hospital. In addition, a group of 60 participants with no known brain disease was recruited. H&E slides were collected from all participants. Thirty-three image features, 22 gray-level co-occurrence matrix (GLCM) features and 11 gray-level run-length matrix (GLRLM) features, were extracted from the tumor volume delineated by a medical technologist on the H&E slides. Five machine-learning algorithms, namely decision tree (DT), extreme boost (EB), support vector machine (SVM), random forest (RF), and linear model (LM), were used to build five models from the image features extracted from the first cohort. The models were then deployed on the second cohort (local patients), using the selected key image features, as model testing to identify and verify key image features for GBM diagnosis.
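To make the GLCM texture features concrete, the following is a minimal sketch of how a co-occurrence matrix and three classic GLCM descriptors can be computed from a small gray-level patch. It is an illustration only, not the authors' pipeline: the abstract does not state which software, pixel offsets, or gray-level quantization were used, so the function names `glcm` and `glcm_features`, the horizontal offset, the symmetric counting, and the toy 4x4 patch are all assumptions.

```python
from collections import Counter


def glcm(image, dx=1, dy=0, symmetric=True):
    """Return a normalised gray-level co-occurrence matrix as {(i, j): p}.

    `image` is a list of rows of integer gray levels. The offset (dx, dy)
    and the symmetric counting are illustrative choices (assumptions).
    """
    counts = Counter()
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[(image[r][c], image[r2][c2])] += 1
                if symmetric:
                    # Count the pair in both directions.
                    counts[(image[r2][c2], image[r][c])] += 1
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}


def glcm_features(p):
    """Compute three common GLCM descriptors from a normalised matrix."""
    contrast = sum(((i - j) ** 2) * v for (i, j), v in p.items())
    energy = sum(v * v for v in p.values())  # angular second moment
    homogeneity = sum(v / (1 + (i - j) ** 2) for (i, j), v in p.items())
    return {"contrast": contrast, "energy": energy, "homogeneity": homogeneity}


# Hypothetical 4x4 patch with four gray levels (not data from the study).
patch = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
p = glcm(patch)
features = glcm_features(p)
print(features)
```

In a real workflow the patch would be a quantized grayscale region of the delineated tumor area, and the resulting feature vector (one row per slide or region) would be the input to the classifiers.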
All five machine-learning algorithms demonstrated excellent performance in GBM diagnosis, achieving an overall accuracy of 100% in the training and validation stages. A total of 12 GLCM and 3 GLRLM image features were identified that differed significantly between normal and GBM images. However, only the SVM model maintained its excellent performance when the models were deployed on the independent local cohort, with an accuracy of 93.5%, sensitivity of 86.95%, and specificity of 99.73%.
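For reference, accuracy, sensitivity, and specificity follow directly from the confusion matrix of a binary classifier. The sketch below shows the arithmetic on hypothetical labels (1 = GBM, 0 = healthy control); the function name `diagnostic_metrics` and the example labels are assumptions for illustration and do not reproduce the study's data or its reported figures.

```python
def diagnostic_metrics(y_true, y_pred, positive=1):
    """Return (accuracy, sensitivity, specificity) for binary labels."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1  # GBM correctly identified
            else:
                fn += 1  # GBM missed
        else:
            if pred == positive:
                fp += 1  # control wrongly flagged as GBM
            else:
                tn += 1  # control correctly identified
    total = tp + fn + tn + fp
    accuracy = (tp + tn) / total if total else float("nan")
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return accuracy, sensitivity, specificity


# Hypothetical test labels (not data from the study).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0]
accuracy, sensitivity, specificity = diagnostic_metrics(y_true, y_pred)
print(accuracy, sensitivity, specificity)
```

A pattern like the one reported, with lower sensitivity than specificity, corresponds to a model that misses some GBM cases but rarely flags a healthy control as GBM.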
In this study, we identified 12 GLCM and 3 GLRLM image features that can aid GBM diagnosis. Among the five models built, the SVM model demonstrated excellent accuracy with very good sensitivity and specificity on the independent cohort. It could potentially be used for GBM diagnosis in future clinical applications.