Key Laboratory of Acupuncture and Medicine Research of Ministry of Education, Nanjing University of Chinese Medicine, 138 Xianlin Road, Nanjing 210023, China.
School of Medical Information and Engineering, Xuzhou Medical University, 209 Tongshan Road, Xuzhou 221004, China.
Database (Oxford). 2024 Aug 30;2024. doi: 10.1093/database/baae083.
In acupuncture diagnosis and treatment, non-quantitative clinical descriptions have limited the development of standardized treatment methods. This study explores the effectiveness of, and the reasons for discrepancies in, entity recognition and meridian classification of acupuncture indications using the Acupuncture Bidirectional Encoder Representations from Transformers (ACUBERT) model. We selected 54 593 distinct entities from 82 acupuncture medical books as the pretraining corpus and used the BERT model to classify Chinese medical literature. We also employed support vector machine and random forest models as comparative baselines, optimized them through parameter tuning, and ultimately developed the ACUBERT model. The results show that the ACUBERT model outperforms the baseline models in classification effectiveness, achieving its best performance at Epoch = 5. The model's precision, recall, and F1 scores all exceeded 0.8. A distinctive feature of the study is that the meridian differentiation model is trained with the eight principles of differentiation and zang-fu differentiation as foundational labels. The study establishes an acupuncture-indication knowledge base (ACU-IKD) and an ACUBERT model with traditional Chinese medicine characteristics. In summary, the ACUBERT model significantly enhances the classification effectiveness of meridian attribution in the acupuncture indication database and also demonstrates the classification advantages of BERT-based deep learning methods on multi-category, large-scale training sets. Database URL: http://acuai.njucm.edu.cn:8081/#/user/login?tenantUrl=default.
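The precision, recall, and F1 figures reported above are standard multi-class classification metrics. As a rough illustration only, the sketch below computes macro-averaged versions of them in plain Python. The abstract does not state which averaging scheme was used, so macro averaging is an assumption here, and the meridian labels and predictions are hypothetical toy values, not data from the study.

```python
from collections import Counter


def macro_prf(y_true, y_pred):
    """Return macro-averaged (precision, recall, F1) over all labels seen.

    Macro averaging is an assumption for illustration; the source does not
    say how its scores were averaged.
    """
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted label gains a false positive
            fn[truth] += 1  # true label gains a false negative

    precisions, recalls, f1s = [], [], []
    for label in labels:
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        p = tp[label] / p_den if p_den else 0.0
        r = tp[label] / r_den if r_den else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)

    n = len(labels)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Hypothetical toy example: true vs. predicted meridian labels.
y_true = ["Lung", "Lung", "Stomach", "Stomach", "Liver", "Liver"]
y_pred = ["Lung", "Stomach", "Stomach", "Stomach", "Liver", "Lung"]

precision, recall, f1 = macro_prf(y_true, y_pred)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

The same function could be applied to the predictions of each model (for example an SVM, a random forest, and a BERT-based classifier) on a shared test set to compare them on equal terms.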