Dehbozorgi Pegah, Ryabchykov Oleg, Bocklitz Thomas W
Leibniz Institute of Photonic Technology, Member of Leibniz Health Technologies, Member of the Leibniz Centre for Photonics in Infection Research (LPI), Albert-Einstein-Strasse 9, 07745, Jena, Germany; Institute of Physical Chemistry (IPC) and Abbe Center of Photonics (ACP), Friedrich Schiller University Jena, Member of the Leibniz Centre for Photonics in Infection Research (LPI), Helmholtzweg 4, 07743, Jena, Germany.
Comput Biol Med. 2025 Mar;187:109768. doi: 10.1016/j.compbiomed.2025.109768. Epub 2025 Jan 31.
Feature extraction in machine learning (ML) transforms raw data into a more meaningful and interpretable representation. In this study, we examined a range of feature extraction techniques and assessed their impact on binary classification models for medical images across several imaging modalities. Using H&E-stained, chest X-ray, and retina OCT images, we extracted statistical, radiomics, and deep features, which were then used to build PCA-LDA classifiers. We evaluated the models on two metrics: latency and performance. Latency measures the time taken for feature extraction and prediction, while mean sensitivity (balanced accuracy) characterizes model performance. Our comparison showed that statistical and radiomics features were less effective for medical image classification, with high latency and lower performance scores. In contrast, pre-trained deep learning (DL) networks performed efficiently, with high sensitivity and low latency. For H&E-stained images, statistical feature extraction took about an hour and achieved 90.8% sensitivity, while ResNet50 reduced processing time fourfold and increased sensitivity to 96.9%. For chest X-rays, radiomics features were time-intensive and reached 92.2% sensitivity, while ResNet50 improved sensitivity to 96% with a shorter extraction time. For retina OCT images, radiomics yielded a sensitivity of 91%, while DenseNet121 achieved 98.6% sensitivity in 15 min. These findings underscore the superior performance of DL techniques over statistical and radiomics features and highlight their potential for real-world applications where accurate and rapid diagnostic decisions are essential.
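The two evaluation metrics named in the abstract, latency and mean sensitivity (balanced accuracy), can be illustrated with a minimal sketch. This is not the authors' code: the helper names `mean_sensitivity` and `timed` and the toy labels are hypothetical and chosen only for illustration. Mean sensitivity is computed here as the unweighted average of per-class recall, and latency as wall-clock time around a function call.

```python
import time
from collections import defaultdict


def mean_sensitivity(y_true, y_pred):
    """Unweighted average of per-class recall (balanced accuracy)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("empty input")
    total = defaultdict(int)    # number of samples per true class
    correct = defaultdict(int)  # correctly predicted samples per true class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    return sum(correct[c] / total[c] for c in total) / len(total)


def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


# Toy labels (hypothetical, not study data): 8 negatives, 2 positives.
y_true = [0] * 8 + [1] * 2
y_pred = [0] * 8 + [1, 0]  # one of the two positives is missed

score, seconds = timed(mean_sensitivity, y_true, y_pred)
print(f"mean sensitivity: {score:.3f}")  # recall 1.0 and 0.5 average to 0.750
print(f"latency: {seconds:.6f} s")
```

On this imbalanced toy example, plain accuracy would be 0.9 while mean sensitivity is 0.75, which shows why a class-balanced metric is the more conservative choice when one class is rare. In a real pipeline, the same `timed` wrapper would be applied to the feature extraction and prediction steps rather than to the metric itself.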