School of Statistics and Information, Shanghai University of International Business and Economics, Shanghai, 201620, China.
China National Institute of Standardization, Beijing, China.
BMC Med Inform Decis Mak. 2020 Jul 9;20(Suppl 3):118. doi: 10.1186/s12911-020-1108-1.
A semi-supervised model that uses feature words is proposed for extracting clinical terms of Traditional Chinese Medicine (TCM).
The extraction model is based on BiLSTM-CRF and combines semi-supervised learning with a feature word set, which reduces the cost of manual annotation and leverages the extraction results.
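One way to picture how semi-supervised learning and a feature word set could interact is a self-training loop in which predictions on unlabeled text are kept only when they contain a feature word for their entity type. The sketch below is an illustration under stated assumptions, not the authors' procedure: the abstract does not say how feature words enter the model. The names `self_train`, `toy_train`, and `feature_words` are hypothetical, and a simple lexicon matcher stands in for the BiLSTM-CRF tagger so the example runs without any deep learning library.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Entity = Tuple[str, str]                 # (term text, entity type)
Example = Tuple[str, List[Entity]]       # (sentence, its entities)
Predictor = Callable[[str], List[Entity]]


def toy_train(labeled: Sequence[Example]) -> Predictor:
    """Stand-in for training a BiLSTM-CRF (assumption): memorise known terms."""
    lexicon = {term: etype for _, ents in labeled for term, etype in ents}

    def predict(sentence: str) -> List[Entity]:
        return [(t, ty) for t, ty in lexicon.items() if t in sentence]

    return predict


def self_train(
    labeled: Sequence[Example],
    unlabeled: Sequence[str],
    train: Callable[[Sequence[Example]], Predictor],
    feature_words: Dict[str, Sequence[str]],
    rounds: int = 3,
) -> Tuple[Predictor, List[Example]]:
    """Hypothetical self-training loop filtered by feature words."""
    labeled = list(labeled)
    pool = list(unlabeled)
    for _ in range(rounds):
        predict = train(labeled)
        kept, remaining = [], []
        for sentence in pool:
            # Trust a predicted entity only if it contains a feature word
            # registered for its type (assumed filtering rule).
            trusted = [
                (term, etype)
                for term, etype in predict(sentence)
                if any(w in term for w in feature_words.get(etype, ()))
            ]
            if trusted:
                kept.append((sentence, trusted))
            else:
                remaining.append(sentence)
        if not kept:          # nothing new to learn from; stop early
            break
        labeled.extend(kept)  # pseudo-labeled sentences join the training set
        pool = remaining
    return train(labeled), labeled


seed = [("Patient took Guizhi Tang for headache",
         [("Guizhi Tang", "formula"), ("headache", "symptom")])]
raw = ["Guizhi Tang was prescribed again",
       "No relevant terms here",
       "headache persisted"]
# Illustrative feature words only: common pinyin suffixes of formula names.
suffixes = {"formula": ["Tang", "Wan", "San"]}

model, grown = self_train(seed, raw, toy_train, suffixes)
```

In this toy run only the sentence containing "Guizhi Tang" is pseudo-labeled, because "headache" has no feature word registered for the symptom type. With a real tagger, the filter would play the same role of limiting the noise added to the training set in each round.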
Experimental results show that the proposed model improves the extraction of five types of TCM clinical terms: traditional Chinese medicines, symptoms, patterns, diseases, and formulas. The best F1 score in the experiments reaches 78.70% on the test dataset.
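For reference, F1 is the harmonic mean of precision and recall. The sketch below computes it at the entity level with exact matching on span and type. That matching rule is an assumption, since the abstract does not state how matches or averages were defined, and the function name `entity_prf` is hypothetical.

```python
from typing import Iterable, Tuple

Span = Tuple[int, int, str]  # (start index, end index, entity type)


def entity_prf(gold: Iterable[Span], pred: Iterable[Span]) -> Tuple[float, float, float]:
    """Precision, recall and F1 over exactly matching (span, type) entities."""
    gold_set, pred_set = set(gold), set(pred)
    tp = len(gold_set & pred_set)  # predictions that match a gold entity exactly
    precision = tp / len(pred_set) if pred_set else 0.0
    recall = tp / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


gold = [(0, 2, "formula"), (5, 6, "symptom"), (8, 9, "disease")]
pred = [(0, 2, "formula"), (5, 6, "disease")]  # second span has the wrong type
p, r, f1 = entity_prf(gold, pred)
```

Here one of two predictions is correct (precision 0.5) and one of three gold entities is found (recall 1/3), giving an F1 of 0.4.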
This method can reduce the cost of manual labeling and improve named entity recognition (NER) results for TCM clinical terms.