A cross-institutional evaluation on breast cancer phenotyping NLP algorithms on electronic health records.
Author information
Zhou Sicheng, Wang Nan, Wang Liwei, Sun Ju, Blaes Anne, Liu Hongfang, Zhang Rui
Affiliations
Institute for Health Informatics, University of Minnesota, Minneapolis, MN, USA.
School of Statistics, University of Minnesota, Minneapolis, MN, USA.
Publication information
Comput Struct Biotechnol J. 2023 Aug 22;22:32-40. doi: 10.1016/j.csbj.2023.08.018. eCollection 2023.
OBJECTIVE
Transformer-based language models are prevalent in the clinical domain due to their excellent performance on clinical NLP tasks, but their generalizability is usually ignored during model development. This study evaluated the generalizability of CancerBERT, a Transformer-based clinical NLP model, alongside classic machine learning models, i.e., the conditional random field (CRF) and the bi-directional long short-term memory CRF (BiLSTM-CRF), across different clinical institutes through a breast cancer phenotype extraction task.
MATERIALS AND METHODS
Two clinical corpora of breast cancer patients were collected from the electronic health records of the University of Minnesota (UMN) and the Mayo Clinic (MC) and annotated following the same guideline. We developed three types of NLP models (CRF, BiLSTM-CRF, and CancerBERT) to extract cancer phenotypes from clinical texts. We evaluated the generalizability of the models on different test sets under different learning strategies (model transfer vs. locally trained), and assessed the entity coverage score and its association with model performance.
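The abstract does not define the entity coverage score; a minimal sketch, assuming it measures the fraction of distinct test-set entity strings that also appear in the training data (the paper's exact definition may differ, and the example entities are hypothetical):

```python
def entity_coverage(train_entities, test_entities):
    """Fraction of distinct test entities also seen in training data.

    NOTE: this definition is an assumption for illustration; the paper's
    exact entity coverage score may be computed differently.
    """
    train_set = {e.lower() for e in train_entities}
    test_set = {e.lower() for e in test_entities}
    if not test_set:
        return 0.0
    return len(test_set & train_set) / len(test_set)

# Hypothetical entities annotated at the two institutes
umn_train = ["tamoxifen", "HER2 positive", "stage II"]
mc_test = ["tamoxifen", "HER2 positive", "grade 3"]
coverage = entity_coverage(umn_train, mc_test)  # 2 of 3 test entities covered
```

A higher coverage between a training corpus and a foreign test set would suggest a transferred model faces fewer unseen entities, which is one plausible way such a score relates to cross-institutional performance.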
RESULTS
We manually annotated 200 and 161 clinical documents at UMN and MC, respectively. The two institutes' corpora showed higher similarity between the target entities than between the corpora overall. The CancerBERT models achieved the best performance on the independent test sets from the two clinical institutes and on the permutation test set. The CancerBERT model developed at one institute and further fine-tuned at the other achieved performance comparable to the model developed on local data (micro-F1: 0.925 vs. 0.932).
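The micro-F1 scores above pool true positives, false positives, and false negatives across all entity types before computing F1, so frequent entity types weigh more heavily than rare ones. A minimal sketch of entity-level micro-F1, with hypothetical (document, span, label) tuples:

```python
def micro_f1(gold, pred):
    """Micro-averaged F1 over entity tuples, e.g. (doc_id, span, label).

    Counts are pooled across all entity types before computing
    precision and recall, so common types dominate the score.
    """
    gold_set, pred_set = set(gold), set(pred)
    tp = len(gold_set & pred_set)   # exact matches
    fp = len(pred_set - gold_set)   # spurious predictions
    fn = len(gold_set - pred_set)   # missed gold entities
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Hypothetical annotations: one correct match, one miss, one false alarm
gold = {(1, "tamoxifen", "DRUG"), (1, "stage II", "STAGE")}
pred = {(1, "tamoxifen", "DRUG"), (1, "grade 3", "GRADE")}
score = micro_f1(gold, pred)  # tp=1, fp=1, fn=1 → F1 = 0.5
```

Libraries such as seqeval implement the same entity-level evaluation for BIO-tagged sequences; the hand-rolled version here only illustrates the pooling behind a micro-averaged score like 0.925.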
CONCLUSIONS
The results indicate that, among the three types of clinical NLP models, the CancerBERT model has superior learning ability and generalizability for our named entity recognition task. It has an advantage in recognizing complex entities, e.g., entities that take different labels in different contexts.