Department of Radiology, The First Affiliated Hospital of Chongqing Medical University, 400016, China.
Comput Math Methods Med. 2022 Jun 21;2022:7852958. doi: 10.1155/2022/7852958. eCollection 2022.
Chest X-ray imaging is the most widely used test for pneumonia, a serious health threat to children. However, diagnosing pneumonia relies on the expertise of experienced radiologists, and the scarcity of medical resources motivates research on computer-aided diagnosis (CAD). In this study, we propose MP-ViT, the Multisemantic Level Patch Merger Vision Transformer, for automatic diagnosis of pneumonia in chest X-ray images. We introduce Patch Merger to reduce the computational cost of ViT. The intermediate results computed by Patch Merger also participate in the final classification in a concise way, which makes full use of the intermediate information in the high-level semantic space, lets the model learn from local to global features, and avoids the information loss caused by Patch Merger. We conducted experiments on a dataset of 3,883 chest X-ray images labeled as pneumonia and 1,349 images labeled as normal. The results show that, even without pretraining ViT on a large dataset, our model achieves an accuracy of 0.91, a precision of 0.92, a recall of 0.89, and an F1-score of 0.90, outperforming Patch Merger on a small dataset. The model can provide CAD support for physicians and improve diagnostic reliability.
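As a quick consistency check on the reported metrics, the F1-score can be recomputed as the harmonic mean of precision and recall. The sketch below is illustrative only: the function name `f1_score` is a hypothetical helper, not taken from the paper. It assumes the reported values are plain (not per-class averaged) precision and recall, which the abstract does not state. The majority-class baseline is computed over the full dataset counts because the abstract does not give the train/test split.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract.
precision, recall = 0.92, 0.89
f1 = f1_score(precision, recall)

# Class counts as reported in the abstract.
n_pneumonia, n_normal = 3883, 1349

# Accuracy of a trivial classifier that always predicts the larger class.
# Assumption: computed over the whole dataset, since the split is not given.
majority_baseline = max(n_pneumonia, n_normal) / (n_pneumonia + n_normal)

print(f"F1 from reported precision/recall: {f1:.2f}")
print(f"Majority-class baseline accuracy:  {majority_baseline:.3f}")
```

Under these assumptions the recomputed F1 rounds to 0.90, matching the reported value. The dataset is imbalanced (about 74% pneumonia images), so accuracy is best read alongside precision and recall.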