Surgical procedure long terms recognition from Chinese literature incorporating structural feature.

Author information

Jiale Nan, Gao Dongping, Sun Yuanyuan, Li Xiaoying, Shen Xifeng, Li Meiting, Zhang Weining, Ren Huiling, Qin Yi

Affiliation

Institute of Medical Information, Chinese Academy of Medical Sciences, Peking Union Medical College, Beijing, 100020, China.

Publication information

Heliyon. 2022 Oct 29;8(11):e11291. doi: 10.1016/j.heliyon.2022.e11291. eCollection 2022 Nov.

Abstract

With the rapid development of medical diagnosis and treatment technologies, novel and complicated concepts and usages of clinical terms, especially those for surgical procedures, have become common in daily routine. A surgical procedure is expected to be performed in an operating room and to involve an incision made at expert discretion, and it implies clinical understanding of diagnosis, examination, testing, equipment, drugs, symptoms, and so on. Terms expressing surgical procedures are nevertheless difficult to recognize, because they are highly distinctive: they are long and exhibit complex linguistic phenomena. To achieve higher recognition performance and overcome the absence of natural delimiters in Chinese sentences, we propose a Named Entity Recognition (NER) model named Structural-SoftLexicon-Bi-LSTM-CRF (SSBC), empowered by the pre-trained model BERT. In particular, we pre-trained a lexicon embedding over a large-scale medical corpus to better leverage domain-specific structural knowledge. With the input additionally augmented by BERT, rich multigranular information and structural term information are transferred from Structural-SoftLexicon to the downstream Bi-LSTM-CRF model, which yields a globally optimal prediction for the input sequence. We evaluate our model on a self-built corpus, and the results show that SSBC with the pre-trained model outperforms other state-of-the-art benchmarks, by up to 3.77% in F1 score. We hope this study will benefit Diagnosis-Related Groups (DRGs) and Diagnosis Intervention Package (DIP) grouping systems, medical record statistics and analysis, the Medicare payment system, and similar applications.
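The abstract does not spell out how Structural-SoftLexicon builds its features, so the following is only a minimal sketch of the general SoftLexicon idea it builds on: because Chinese has no word delimiters, every lexicon word that matches a span of the sentence is attached to the characters it covers, grouped by whether the character Begins, is in the Middle of, Ends, or is a Single-character match of that word. The function name `soft_lexicon_sets`, the toy lexicon, and the `max_word_len` parameter are assumptions made for illustration; the structural extension, the pre-trained lexicon embedding, BERT, and the Bi-LSTM-CRF layers are not shown.

```python
from collections import defaultdict


def soft_lexicon_sets(sentence, lexicon, max_word_len=8):
    """Collect, for each character, the lexicon words matching at B/M/E/S positions.

    Illustrative only: this is a generic SoftLexicon-style matcher,
    not the SSBC implementation described in the paper.
    """
    n = len(sentence)
    per_char = [defaultdict(set) for _ in range(n)]
    for start in range(n):
        # Try every candidate span that begins at `start`.
        for end in range(start + 1, min(n, start + max_word_len) + 1):
            word = sentence[start:end]
            if word not in lexicon:
                continue
            if len(word) == 1:
                per_char[start]["S"].add(word)
                continue
            per_char[start]["B"].add(word)      # first character of the word
            per_char[end - 1]["E"].add(word)    # last character of the word
            for mid in range(start + 1, end - 1):
                per_char[mid]["M"].add(word)    # interior characters
    return [{tag: sorted(sets[tag]) for tag in "BMES"} for sets in per_char]


# Hypothetical toy lexicon around "laparoscopic cholecystectomy".
toy_lexicon = {"腹腔", "腹腔镜", "胆囊", "切除", "切除术", "胆囊切除术"}
sentence = "腹腔镜胆囊切除术"
features = soft_lexicon_sets(sentence, toy_lexicon)

for char, sets in zip(sentence, features):
    print(char, sets)
```

In this toy example the long term 胆囊切除术 contributes a B match at 胆, M matches at 囊, 切 and 除, and an E match at 术, alongside the shorter overlapping words. A model in this family would typically embed each set and concatenate the result with the character representation before the sequence encoder, which is how long, nested terms can inform every character they span.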

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/19d5/9640963/021df261df27/gr1.jpg
