• 文献检索
  • 文档翻译
  • 深度研究
  • 学术资讯
  • Suppr Zotero 插件Zotero 插件
  • 邀请有礼
  • 套餐&价格
  • 历史记录
应用&插件
Suppr Zotero 插件Zotero 插件浏览器插件Mac 客户端Windows 客户端微信小程序
定价
高级版会员购买积分包购买API积分包
服务
文献检索文档翻译深度研究API 文档MCP 服务
关于我们
关于 Suppr公司介绍联系我们用户协议隐私条款
关注我们

Suppr 超能文献

核心技术专利:CN118964589B侵权必究
粤ICP备2023148730 号-1Suppr @ 2026

文献检索

告别复杂PubMed语法,用中文像聊天一样搜索,搜遍4000万医学文献。AI智能推荐,让科研检索更轻松。

立即免费搜索

文件翻译

保留排版,准确专业,支持PDF/Word/PPT等文件格式,支持 12+语言互译。

免费翻译文档

深度研究

AI帮你快速写综述,25分钟生成高质量综述,智能提取关键信息,辅助科研写作。

立即免费体验

相似文献

1
Utilizing active learning strategies in machine-assisted annotation for clinical named entity recognition: a comprehensive analysis considering annotation costs and target effectiveness.利用主动学习策略在机器辅助标注中进行临床命名实体识别:考虑标注成本和目标效果的综合分析。
J Am Med Inform Assoc. 2024 Nov 1;31(11):2632-2640. doi: 10.1093/jamia/ocae197.
2
Comparison of Two Modern Survival Prediction Tools, SORG-MLA and METSSS, in Patients With Symptomatic Long-bone Metastases Who Underwent Local Treatment With Surgery Followed by Radiotherapy and With Radiotherapy Alone.两种现代生存预测工具 SORG-MLA 和 METSSS 在接受手术联合放疗和单纯放疗治疗有症状长骨转移患者中的比较。
Clin Orthop Relat Res. 2024 Dec 1;482(12):2193-2208. doi: 10.1097/CORR.0000000000003185. Epub 2024 Jul 23.
3
From BERT to generative AI - Comparing encoder-only vs. large language models in a cohort of lung cancer patients for named entity recognition in unstructured medical reports.从BERT到生成式人工智能——在一组肺癌患者中比较仅编码器模型与大语言模型用于非结构化医疗报告中的命名实体识别
Comput Biol Med. 2025 Sep;195:110665. doi: 10.1016/j.compbiomed.2025.110665. Epub 2025 Jun 24.
4
Semi-supervised learning from small annotated data and large unlabeled data for fine-grained Participants, Intervention, Comparison, and Outcomes entity recognition.从小规模标注数据和大规模未标注数据中进行半监督学习,用于细粒度的参与者、干预措施、对照和结果实体识别。
J Am Med Inform Assoc. 2025 Mar 1;32(3):555-565. doi: 10.1093/jamia/ocae326.
5
Signs and symptoms to determine if a patient presenting in primary care or hospital outpatient settings has COVID-19.在基层医疗机构或医院门诊环境中,如果患者出现以下症状和体征,可判断其是否患有 COVID-19。
Cochrane Database Syst Rev. 2022 May 20;5(5):CD013665. doi: 10.1002/14651858.CD013665.pub3.
6
Extraction of sleep information from clinical notes of Alzheimer's disease patients using natural language processing.使用自然语言处理从阿尔茨海默病患者的临床记录中提取睡眠信息。
J Am Med Inform Assoc. 2024 Oct 1;31(10):2217-2227. doi: 10.1093/jamia/ocae177.
7
Systemic treatments for metastatic cutaneous melanoma.转移性皮肤黑色素瘤的全身治疗
Cochrane Database Syst Rev. 2018 Feb 6;2(2):CD011123. doi: 10.1002/14651858.CD011123.pub2.
8
Knowledge Graph-Enhanced Deep Learning Model (H-SYSTEM) for Hypertensive Intracerebral Hemorrhage: Model Development and Validation.用于高血压性脑出血的知识图谱增强深度学习模型(H-SYSTEM):模型开发与验证
J Med Internet Res. 2025 Jun 12;27:e66055. doi: 10.2196/66055.
9
Intravenous magnesium sulphate and sotalol for prevention of atrial fibrillation after coronary artery bypass surgery: a systematic review and economic evaluation.静脉注射硫酸镁和索他洛尔预防冠状动脉搭桥术后房颤:系统评价与经济学评估
Health Technol Assess. 2008 Jun;12(28):iii-iv, ix-95. doi: 10.3310/hta12280.
10
Systemic pharmacological treatments for chronic plaque psoriasis: a network meta-analysis.系统性药理学治疗慢性斑块状银屑病:网络荟萃分析。
Cochrane Database Syst Rev. 2021 Apr 19;4(4):CD011535. doi: 10.1002/14651858.CD011535.pub4.

引用本文的文献

1
Context Matching is not Reasoning: Assessing Generalized Evaluation of Generative Language Models in Clinical Settings.上下文匹配并非推理:评估生成式语言模型在临床环境中的广义评估
Res Sq. 2025 Aug 29:rs.3.rs-7325383. doi: 10.21203/rs.3.rs-7325383/v1.

本文引用的文献

1
Improving large language models for clinical named entity recognition via prompt engineering.通过提示工程改进临床命名实体识别的大型语言模型。
J Am Med Inform Assoc. 2024 Sep 1;31(9):1812-1820. doi: 10.1093/jamia/ocad259.
2
An extensive benchmark study on biomedical text generation and mining with ChatGPT.一项关于使用ChatGPT进行生物医学文本生成和挖掘的广泛基准研究。
Bioinformatics. 2023 Sep 2;39(9). doi: 10.1093/bioinformatics/btad557.
3
ChatGPT outperforms crowd workers for text-annotation tasks.在文本注释任务中,ChatGPT的表现优于众包工作者。
Proc Natl Acad Sci U S A. 2023 Jul 25;120(30):e2305016120. doi: 10.1073/pnas.2305016120. Epub 2023 Jul 18.
4
Active neural networks to detect mentions of changes to medication treatment in social media.主动神经网络可用于检测社交媒体中药物治疗变更的提及。
J Am Med Inform Assoc. 2021 Nov 25;28(12):2551-2561. doi: 10.1093/jamia/ocab158.
5
Adversarial active learning for the identification of medical concepts and annotation inconsistency.对抗式主动学习在医学概念识别和标注不一致性中的应用。
J Biomed Inform. 2020 Aug;108:103481. doi: 10.1016/j.jbi.2020.103481. Epub 2020 Jul 18.
6
Deep learning in clinical natural language processing: a methodical review.深度学习在临床自然语言处理中的应用:系统综述。
J Am Med Inform Assoc. 2020 Mar 1;27(3):457-470. doi: 10.1093/jamia/ocz200.
7
2018 n2c2 shared task on adverse drug events and medication extraction in electronic health records.2018n2c2 电子健康记录中药物不良反应和药物提取共享任务。
J Am Med Inform Assoc. 2020 Jan 1;27(1):3-12. doi: 10.1093/jamia/ocz166.
8
Cost-aware active learning for named entity recognition in clinical text.基于成本意识的临床文本命名实体识别的主动学习。
J Am Med Inform Assoc. 2019 Nov 1;26(11):1314-1322. doi: 10.1093/jamia/ocz102.
9
Overview of the First Natural Language Processing Challenge for Extracting Medication, Indication, and Adverse Drug Events from Electronic Health Record Notes (MADE 1.0).从电子健康记录中提取药物、适应症和药物不良事件的自然语言处理挑战赛概述(MADE 1.0)。
Drug Saf. 2019 Jan;42(1):99-111. doi: 10.1007/s40264-018-0762-z.
10
Active learning reduces annotation time for clinical concept extraction.主动学习减少了临床概念提取的标注时间。
Int J Med Inform. 2017 Oct;106:25-31. doi: 10.1016/j.ijmedinf.2017.08.001. Epub 2017 Aug 5.

利用主动学习策略在机器辅助标注中进行临床命名实体识别:考虑标注成本和目标效果的综合分析。

Utilizing active learning strategies in machine-assisted annotation for clinical named entity recognition: a comprehensive analysis considering annotation costs and target effectiveness.

机构信息

School of Statistics and Mathematics, Zhongnan University of Economics and Law, Wuhan, Hubei 430073, China.

Graduate School of Public Health, St Luke's International University, OMURA Susumu & Mieko Memorial St Luke's Center for Clinical Academia, Chuo-ku, Tokyo 104-0045, Japan.

出版信息

J Am Med Inform Assoc. 2024 Nov 1;31(11):2632-2640. doi: 10.1093/jamia/ocae197.

DOI:10.1093/jamia/ocae197
PMID:39081233
原文链接:https://pmc.ncbi.nlm.nih.gov/articles/PMC11491619/
Abstract

OBJECTIVES

Active learning (AL) has rarely integrated diversity-based and uncertainty-based strategies into a dynamic sampling framework for clinical named entity recognition (NER). Machine-assisted annotation is becoming popular for creating gold-standard labels. This study investigated the effectiveness of dynamic AL strategies under simulated machine-assisted annotation scenarios for clinical NER.

MATERIALS AND METHODS

We proposed 3 new AL strategies: a diversity-based strategy (CLUSTER) based on Sentence-BERT and 2 dynamic strategies (CLC and CNBSE) capable of switching from diversity-based to uncertainty-based strategies. Using BioClinicalBERT as the foundational NER model, we conducted simulation experiments on 3 medication-related clinical NER datasets independently: i2b2 2009, n2c2 2018 (Track 2), and MADE 1.0. We compared the proposed strategies with uncertainty-based (LC and NBSE) and passive-learning (RANDOM) strategies. Performance was primarily measured by the number of edits made by the annotators to achieve a desired target effectiveness evaluated on independent test sets.

RESULTS

When aiming for 98% overall target effectiveness, on average, CLUSTER required the fewest edits. When aiming for 99% overall target effectiveness, CNBSE required 20.4% fewer edits than NBSE did. CLUSTER and RANDOM could not achieve such a high target under the pool-based simulation experiment. For high-difficulty entities, CNBSE required 22.5% fewer edits than NBSE to achieve 99% target effectiveness, whereas neither CLUSTER nor RANDOM achieved 93% target effectiveness.

DISCUSSION AND CONCLUSION

When the target effectiveness was set high, the proposed dynamic strategy CNBSE exhibited both strong learning capabilities and low annotation costs in machine-assisted annotation. CLUSTER required the fewest edits when the target effectiveness was set low.

摘要

目的

主动学习(AL)很少将基于多样性和基于不确定性的策略集成到临床命名实体识别(NER)的动态采样框架中。机器辅助标注正成为创建黄金标准标签的流行方法。本研究调查了在模拟机器辅助标注场景下,针对临床 NER 的动态 AL 策略的有效性。

材料与方法

我们提出了 3 种新的 AL 策略:一种基于 Sentence-BERT 的基于多样性的策略(CLUSTER)和 2 种能够从基于多样性切换到基于不确定性的策略的动态策略(CLC 和 CNBSE)。我们使用 BioClinicalBERT 作为基础 NER 模型,在 3 个独立的药物相关临床 NER 数据集上进行了模拟实验:i2b2 2009、n2c2 2018(第 2 轨道)和 MADE 1.0。我们将所提出的策略与基于不确定性的(LC 和 NBSE)和被动学习的(RANDOM)策略进行了比较。性能主要通过标注者为达到在独立测试集上评估的目标效果所需的编辑次数来衡量。

结果

当目标整体效果达到 98%时,CLUSTER 平均需要的编辑次数最少。当目标整体效果达到 99%时,CNBSE 比 NBSE 少需要 20.4%的编辑次数。CLUSTER 和 RANDOM 在基于池的模拟实验中无法达到如此高的目标。对于高难度实体,CNBSE 比 NBSE 少需要 22.5%的编辑次数即可达到 99%的目标效果,而 CLUSTER 和 RANDOM 都没有达到 93%的目标效果。

讨论与结论

当目标效果设置较高时,所提出的动态策略 CNBSE 在机器辅助标注中表现出较强的学习能力和较低的标注成本。当目标效果设置较低时,CLUSTER 需要的编辑次数最少。