Beyond Tokens: Fair Evaluation of French Large Language Models for Clinical Named Entity Recognition.

Affiliations

Division of Medical Information Sciences, Geneva University Hospitals, Geneva, Switzerland.

Department of Radiology and Medical Informatics, University of Geneva, Geneva, Switzerland.

Publication information

Stud Health Technol Inform. 2024 Aug 22;316:666-670. doi: 10.3233/SHTI240502.

DOI: 10.3233/SHTI240502
PMID: 39176830
Abstract

Named Entity Recognition (NER) models based on Transformers have gained prominence for their impressive performance in various languages and domains. This work delves into the often-overlooked aspect of entity-level metrics and exposes significant discrepancies between token and entity-level evaluations. The study utilizes a corpus of synthetic French oncological reports annotated with entities representing oncological morphologies. Four different French BERT-based models are fine-tuned for token classification, and their performance is rigorously assessed at both token and entity-level. In addition to fine-tuning, we evaluate ChatGPT's ability to perform NER through prompt engineering techniques. The findings reveal a notable disparity in model effectiveness when transitioning from token to entity-level metrics, highlighting the importance of comprehensive evaluation methodologies in NER tasks. Furthermore, in comparison to BERT, ChatGPT remains limited when it comes to detecting advanced entities in French.
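
The gap between token-level and entity-level scores described in the abstract can be illustrated with a small sketch. It assumes BIO tagging and strict exact-match scoring at the entity level; the abstract does not state the exact metric definitions the authors used, so these are assumptions. The `MORPH` label, the example phrase, and all function names are hypothetical and chosen only for illustration.

```python
from typing import List, Set, Tuple

Span = Tuple[str, int, int]  # (label, start index, end index exclusive)


def extract_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from a BIO tag sequence."""
    spans: Set[Span] = set()
    start, label = None, None
    # A trailing "O" sentinel flushes a span that ends on the last token.
    for i, tag in enumerate(tags + ["O"]):
        continues = tag.startswith("I-") and label == tag[2:]
        if continues:
            continue
        if label is not None:
            spans.add((label, start, i))
        if tag == "O":
            start, label = None, None
        else:
            # A B- tag (or a stray I- tag) opens a new span.
            start, label = i, tag[2:]
    return spans


def f1(tp: int, n_pred: int, n_gold: int) -> float:
    """F1 from true positives, predicted count and gold count."""
    if tp == 0:
        return 0.0
    precision, recall = tp / n_pred, tp / n_gold
    return 2 * precision * recall / (precision + recall)


def token_f1(gold: List[str], pred: List[str]) -> float:
    """Token-level F1: each non-O token is scored on its own."""
    tp = sum(1 for g, p in zip(gold, pred) if g != "O" and g == p)
    n_pred = sum(1 for p in pred if p != "O")
    n_gold = sum(1 for g in gold if g != "O")
    return f1(tp, n_pred, n_gold)


def entity_f1(gold: List[str], pred: List[str]) -> float:
    """Entity-level F1 (assumed strict): label and both boundaries must match."""
    g, p = extract_spans(gold), extract_spans(pred)
    return f1(len(g & p), len(p), len(g))


# Illustrative tokens: "carcinome canalaire infiltrant du sein"
gold = ["B-MORPH", "I-MORPH", "I-MORPH", "O", "O"]
pred = ["B-MORPH", "I-MORPH", "O", "O", "O"]  # misses the last entity token

print(f"token F1:  {token_f1(gold, pred):.2f}")
print(f"entity F1: {entity_f1(gold, pred):.2f}")
```

In this toy case the prediction gets two of the three entity tokens right, so the token-level F1 is 0.80, while the only predicted span has the wrong right boundary, so the strict entity-level F1 is 0.00. A more lenient entity-level scheme (for example, partial-overlap matching) would give a different number.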

