Liang Daniel D, Liang David D, Pomeroy Marc J, Gao Yongfeng, Kuo Licheng R, Li Lihong C
Ward Melville High School, East Setauket, New York, United States.
University of Chicago, Department of Computer Science, Chicago, Illinois, United States.
J Med Imaging (Bellingham). 2024 Jul;11(4):044501. doi: 10.1117/1.JMI.11.4.044501. Epub 2024 Jul 9.
Medical imaging-based machine learning (ML) for computer-aided diagnosis of lesions consists of two basic modules: (i) feature extraction from non-invasively acquired medical images and (ii) feature classification to predict the malignancy of lesions detected or localized in those images. This study investigates the individual performance of each module in diagnosing pulmonary nodules and colorectal polyps detected by low-dose computed tomography (CT) screening.
Three feature extraction methods were investigated. The first uses the gray-level co-occurrence matrix, a mathematical descriptor of image texture, to extract Haralick image texture features (HFs). The second uses a convolutional neural network (CNN) architecture to extract deep learning (DL) image abstractive features (DFs). The third uses the interactions between lesion tissues and the X-ray energy of CT to extract tissue-energy specific characteristic features (TFs). All three categories of extracted features were classified by a random forest (RF) classifier and compared with the DL-CNN method, which reads the images, extracts the DFs, and classifies them in an end-to-end manner. Diagnostic performance, i.e., the prediction of lesion malignancy, was measured by the area under the receiver operating characteristic curve (AUC). Three lesion image datasets were used, with the lesions' tissue pathology reports serving as the learning labels.
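To make the first extraction method concrete, the following is a minimal sketch of how a gray-level co-occurrence matrix and a few Haralick-style texture features can be computed. It is not the authors' implementation: the function names `glcm` and `haralick_subset`, the single horizontal pixel offset, the four-feature subset, and the 4x4 toy image are all assumptions chosen for illustration.

```python
import numpy as np


def glcm(image, levels, dx=1, dy=0):
    """Symmetric, normalized gray-level co-occurrence matrix for one offset.

    `image` holds integer gray levels in [0, levels); `dx` and `dy` are
    assumed to be non-negative pixel offsets (hypothetical parameters).
    """
    img = np.asarray(image, dtype=np.intp)
    if img.min() < 0 or img.max() >= levels:
        raise ValueError("gray levels must lie in [0, levels)")
    rows, cols = img.shape
    # Pair every pixel with its neighbor located dy rows down and dx columns right.
    src = img[: rows - dy, : cols - dx]
    dst = img[dy:, dx:]
    counts = np.zeros((levels, levels), dtype=np.float64)
    np.add.at(counts, (src.ravel(), dst.ravel()), 1.0)
    counts = counts + counts.T  # count each pair in both directions
    return counts / counts.sum()


def haralick_subset(p):
    """A small subset of Haralick texture features from a normalized GLCM."""
    i, j = np.indices(p.shape)
    nonzero = p[p > 0]
    return {
        "contrast": float(np.sum(p * (i - j) ** 2)),
        "energy": float(np.sum(p ** 2)),  # angular second moment
        "homogeneity": float(np.sum(p / (1.0 + (i - j) ** 2))),
        "entropy": float(-np.sum(nonzero * np.log2(nonzero))),
    }


if __name__ == "__main__":
    # Toy 4x4 patch with four gray levels (illustrative data only).
    patch = [[0, 0, 1, 1],
             [0, 0, 1, 1],
             [0, 2, 2, 2],
             [2, 2, 3, 3]]
    print(haralick_subset(glcm(patch, levels=4)))
```

In a pipeline like the one described, a feature vector of this kind would be computed per lesion (typically over several offsets and directions) and then passed to a classifier such as a random forest.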
Experiments on the three datasets produced AUC values of 0.724 to 0.878 for the HFs, 0.652 to 0.965 for the DFs, and 0.985 to 0.996 for the TFs, compared with 0.694 to 0.964 for the DL-CNN. These outcomes indicate that the RF classifier performed comparably to the DL-CNN classification module and that extracting tissue-energy specific characteristic features dramatically improved the AUC values.
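For readers less familiar with the metric, an AUC value can be read as the probability that a randomly chosen malignant lesion receives a higher predicted score than a randomly chosen benign one, with ties counted as one half. The sketch below computes it directly from that definition. The function name `auc_score` and the labels and scores in the usage example are hypothetical and are not taken from the study's datasets.

```python
def auc_score(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    `labels` are 1 for malignant and 0 for benign; `scores` are the
    classifier's predicted malignancy scores. Ties count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for pos in positives:
        for neg in negatives:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Hypothetical labels and scores, for illustration only.
    print(auc_score([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

This pairwise form is quadratic in the number of cases, which is acceptable for small lesion datasets; rank-based implementations give the same value more efficiently.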
The feature extraction module is more important than the feature classification module. Extraction of tissue-energy specific characteristic features is more important than extraction of image abstractive and characteristic features.