

Part-of-speech tagging for clinical text: wall or bridge between institutions?

Author Information

Fan Jung-wei, Prasad Rashmi, Yabut Rommel M, Loomis Richard M, Zisook Daniel S, Mattison John E, Huang Yang

Affiliation

Kaiser Permanente Southern California, Pasadena, CA, USA.

Publication Information

AMIA Annu Symp Proc. 2011;2011:382-91. Epub 2011 Oct 22.

Abstract

Part-of-speech (POS) tagging is a fundamental step required by various NLP systems. Training a POS tagger relies on a sufficient amount of high-quality annotations. However, the annotation process is both knowledge-intensive and time-consuming in the clinical domain. A promising solution appears to be for institutions to share their annotation efforts, yet there is little research on the associated issues. We performed experiments to understand how POS tagging performance would be affected by using a pre-trained tagger versus raw training data across different institutions. We manually annotated a set of clinical notes at Kaiser Permanente Southern California (KPSC) and a set from the University of Pittsburgh Medical Center (UPMC), and trained/tested POS taggers in intra- and inter-institution settings. The cTAKES POS tagger was also included in the comparison to represent a tagger partially trained on the notes of a third institution, Mayo Clinic at Rochester. Intra-institution 5-fold cross-validation estimated an accuracy of 0.953 and 0.945 on the KPSC and UPMC notes, respectively. Trained purely on KPSC notes, the accuracy was 0.897 when tested on UPMC notes. Trained purely on UPMC notes, the accuracy was 0.904 when tested on KPSC notes. Applying the cTAKES tagger pre-trained on Mayo Clinic's notes, the accuracy was 0.881 on KPSC notes and 0.883 on UPMC notes. After adding UPMC annotations to the KPSC training data, the average accuracy on tested KPSC notes increased to 0.965. After adding KPSC annotations to the UPMC training data, the average accuracy on tested UPMC notes increased to 0.953. The results indicated, first, that the performance of pre-trained POS taggers dropped by about 5% when applied directly across institutions and, second, that mixing in annotations from another institution following the same guideline increased tagging accuracy by about 1%.
Our findings suggest that institutions can benefit more from sharing raw annotations than from sharing pre-trained models for the POS tagging task. We believe the study could also provide general insights on cross-institution data sharing for other types of NLP tasks.
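The evaluation protocol above (train within one institution, test within and across institutions, then pool annotations) can be sketched with a minimal unigram tagger on toy data. The tagger, the two tiny corpora, and the Penn-style tags below are illustrative assumptions for demonstration only, not the paper's actual models or notes:

```python
from collections import Counter, defaultdict

def train_unigram_tagger(sentences):
    """Train a unigram tagger: map each word to its most frequent tag."""
    counts = defaultdict(Counter)
    for sent in sentences:
        for word, tag in sent:
            counts[word][tag] += 1
    return {w: c.most_common(1)[0][0] for w, c in counts.items()}

def accuracy(model, sentences, default="NN"):
    """Token-level tagging accuracy; unknown words fall back to a default tag."""
    correct = total = 0
    for sent in sentences:
        for word, tag in sent:
            total += 1
            if model.get(word, default) == tag:
                correct += 1
    return correct / total if total else 0.0

# Hypothetical toy corpora standing in for annotated notes from two institutions.
corpus_a = [[("patient", "NN"), ("denies", "VBZ"), ("pain", "NN")],
            [("no", "DT"), ("acute", "JJ"), ("distress", "NN")]]
corpus_b = [[("patient", "NN"), ("reports", "VBZ"), ("nausea", "NN")],
            [("mild", "JJ"), ("pain", "NN")]]

tagger_a = train_unigram_tagger(corpus_a)              # single-institution model
tagger_ab = train_unigram_tagger(corpus_a + corpus_b)  # pooled annotations

print(accuracy(tagger_a, corpus_a))   # intra-institution → 1.0
print(accuracy(tagger_a, corpus_b))   # cross-institution drop → 0.6
print(accuracy(tagger_ab, corpus_b))  # pooled training recovers → 1.0
```

Even this toy setup reproduces the qualitative pattern reported in the abstract: a model applied directly across institutions loses accuracy on out-of-vocabulary and differently-used words, while pooling raw annotations from both sources recovers it.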


Similar Articles

A token centric part-of-speech tagger for biomedical text.
Artif Intell Med. 2014 May;61(1):11-20. doi: 10.1016/j.artmed.2014.03.005. Epub 2014 Mar 26.

Developing a corpus of clinical notes manually annotated for part-of-speech.
Int J Med Inform. 2006 Jun;75(6):418-29. doi: 10.1016/j.ijmedinf.2005.08.006. Epub 2005 Sep 19.

