Genies Beijing Co., Ltd., Beijing 100102, China.
Department of Thoracic Surgery, Hainan General Hospital, Haikou, Hainan 570311, China.
Comput Math Methods Med. 2022 May 2;2022:3151554. doi: 10.1155/2022/3151554. eCollection 2022.
Class imbalance and the curse of dimensionality are critical challenges in medical image classification. The n-gram model, a classical machine learning model, has shown excellent performance in addressing these issues in text classification. In this study, we propose an algorithm that classifies medical images by extracting their n-gram semantic features. The algorithm first converts the image classification problem into a text classification problem by building an n-gram corpus for an image, and then classifies the images with an n-gram model. It was evaluated on two independent public datasets. The first experiment distinguished benign from malignant thyroid nodules, with a best area under the curve (AUC) of 0.989. The second experiment identified the type of fundus lesion; in the best result, the algorithm correctly identified 86.667% of patients with dry age-related macular degeneration (AMD), 93.333% of patients with diabetic macular edema (DME), and 93.333% of normal individuals.
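The abstract does not specify how the n-gram corpus is built from an image or which classifier is applied to it, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes that pixel intensities are quantized into a small vocabulary of tokens, that each image row is read as one "sentence", and that an image is assigned to the class whose n-gram count profile is closest in cosine similarity. The function names, the quantization scheme, and the nearest-profile rule are all hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import sqrt


def image_to_tokens(image, levels=4, max_value=255):
    """Quantize each pixel into one of `levels` intensity bins (assumed scheme).

    Each image row becomes one token sequence, i.e. one "sentence".
    """
    step = (max_value + 1) / levels
    return [[min(int(p / step), levels - 1) for p in row] for row in image]


def ngram_counts(token_rows, n=2):
    """Count the n-grams that occur along each row of tokens."""
    counts = Counter()
    for row in token_rows:
        for i in range(len(row) - n + 1):
            counts[tuple(row[i:i + n])] += 1
    return counts


def cosine(a, b):
    """Cosine similarity between two sparse n-gram count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify(image, class_profiles, n=2, levels=4):
    """Return the label whose n-gram profile is most similar to the image's."""
    feats = ngram_counts(image_to_tokens(image, levels), n)
    return max(class_profiles, key=lambda label: cosine(feats, class_profiles[label]))


# Toy data (made up for illustration, not from the paper's datasets).
flat = [[10, 10, 10, 10], [12, 12, 12, 12]]
striped = [[0, 255, 0, 255], [255, 0, 255, 0]]
profiles = {
    "flat": ngram_counts(image_to_tokens(flat)),
    "striped": ngram_counts(image_to_tokens(striped)),
}
query = [[5, 250, 5, 250]]
print(classify(query, profiles))
```

In this toy setup the query row alternates between dark and bright pixels, so its bigrams overlap only with the striped profile. A practical pipeline would likely need a richer tokenization (for example, local patches rather than single pixels) and a trained classifier; those choices are left open here because the abstract does not describe them.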