Liu Wensu, Lv Na, Wan Jing, Wang Lu, Zhou Xiaobei
Key Laboratory of Obesity and Glucose/Lipid Associated Metabolic Diseases, China Medical University, Shenyang, Liaoning, 110122, China.
Institute of Health Sciences, China Medical University, Shenyang, Liaoning, 110122, China.
Heliyon. 2024 Aug 13;10(16):e36191. doi: 10.1016/j.heliyon.2024.e36191. eCollection 2024 Aug 30.
We present an extension of text embedding architectures to grayscale medical image classification. The method combines n-gram features with an efficient pixel flattening technique so that spatial information is preserved when feature representations are generated. All pixels in a grayscale medical image are flattened in a combination of column-wise, row-wise, diagonal-wise, and anti-diagonal-wise orders, so that the feature representations capture spatial dependencies along each of these directions. To evaluate the method, we benchmarked it on five grayscale medical image datasets of varying sizes and complexities. Under 10-fold cross-validation, our approach achieved a test accuracy of 99.92% on the Medical MNIST dataset, 90.06% on the Chest X-ray Pneumonia dataset, 96.94% on the Curated Covid CT dataset, 79.11% on the MIAS dataset, and 93.17% on the Ultrasound dataset. The framework and reproducible code are available on GitHub at https://github.com/xizhou/pixel_embedding.
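To make the flattening idea concrete, the following is a minimal sketch in plain Python of the four traversal orders named in the abstract, followed by n-gram extraction over each flattened sequence. It is not taken from the linked repository: the function names (`flatten_rows`, `flatten_cols`, `flatten_diagonals`, `flatten_antidiagonals`, `pixel_ngrams`, `multi_order_ngrams`), the direction in which each diagonal is walked, and the choice to keep the four sequences separate are all illustrative assumptions.

```python
from typing import Dict, List, Sequence, Tuple

Image = Sequence[Sequence[int]]  # a 2D grid of grayscale intensities


def flatten_rows(img: Image) -> List[int]:
    """Row-wise order: left to right, top to bottom."""
    return [p for row in img for p in row]


def flatten_cols(img: Image) -> List[int]:
    """Column-wise order: top to bottom, left to right."""
    h, w = len(img), len(img[0])
    return [img[r][c] for c in range(w) for r in range(h)]


def flatten_diagonals(img: Image) -> List[int]:
    """Diagonal order: lines where (col - row) is constant,
    starting at the bottom-left corner (an assumed convention)."""
    h, w = len(img), len(img[0])
    out = []
    for d in range(-(h - 1), w):
        for r in range(h):
            c = r + d
            if 0 <= c < w:
                out.append(img[r][c])
    return out


def flatten_antidiagonals(img: Image) -> List[int]:
    """Anti-diagonal order: lines where (row + col) is constant,
    starting at the top-left corner (an assumed convention)."""
    h, w = len(img), len(img[0])
    out = []
    for s in range(h + w - 1):
        for r in range(h):
            c = s - r
            if 0 <= c < w:
                out.append(img[r][c])
    return out


def pixel_ngrams(seq: Sequence[int], n: int = 2) -> List[Tuple[int, ...]]:
    """Sliding-window n-grams over a flattened pixel sequence."""
    return [tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)]


def multi_order_ngrams(img: Image, n: int = 2) -> Dict[str, List[Tuple[int, ...]]]:
    """Hypothetical helper: n-grams for each of the four orders.
    How the orders are actually combined in the paper is not
    specified in the abstract; here they are simply kept side by side."""
    orders = {
        "row": flatten_rows,
        "col": flatten_cols,
        "diag": flatten_diagonals,
        "anti": flatten_antidiagonals,
    }
    return {name: pixel_ngrams(fn(img), n) for name, fn in orders.items()}
```

For a 2x3 image `[[1, 2, 3], [4, 5, 6]]`, the row-wise order keeps horizontally adjacent pixels next to each other, while the column-wise order places vertical neighbours such as 1 and 4 side by side. N-grams over the combined orders can therefore pair pixels that are adjacent in different spatial directions, which is the intuition behind using several orders rather than one.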