
An active learning-enabled annotation system for clinical named entity recognition.

Affiliations

Pieces Technologies Inc, Dallas, TX, USA.

Department of Biomedical Informatics, Vanderbilt University, Nashville, TN, USA.

Publication information

BMC Med Inform Decis Mak. 2017 Jul 5;17(Suppl 2):82. doi: 10.1186/s12911-017-0466-9.

DOI:10.1186/s12911-017-0466-9
PMID:28699546
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5506567/
Abstract

BACKGROUND

Active learning (AL) has shown promising potential to minimize annotation cost while maximizing performance when building statistical natural language processing (NLP) models. However, very few studies have investigated AL in a real-life setting in the medical domain.

METHODS

In this study, we developed the first AL-enabled annotation system for clinical named entity recognition (NER), built around a novel AL algorithm. In addition to a simulation study evaluating the novel AL algorithm, we conducted user studies in which two nurses used the system, to assess how AL performs in real-world annotation when building clinical NER models.

RESULTS

The simulation results show that the novel AL algorithm outperformed a traditional AL algorithm and random sampling. The user study, however, tells a different story: AL methods did not perform better than random sampling for every user.

CONCLUSIONS

We found that the increased information content of actively selected sentences is strongly offset by the increased time required to annotate them. Moreover, the querying algorithms did not take annotation time into account. Our future work includes developing better AL algorithms that estimate annotation time and evaluating the system with a larger number of users.
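
The trade-off described in the conclusions can be illustrated with a small sketch. It is not the authors' algorithm: the `Candidate` fields, the uncertainty scores, the estimated annotation times, and the 60-second budget are all hypothetical values chosen for illustration. The sketch compares ranking purely by model uncertainty with ranking by uncertainty per estimated second of annotation time, under a fixed time budget.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    sentence: str
    uncertainty: float   # hypothetical model uncertainty score in [0, 1]
    est_seconds: float   # hypothetical estimated annotation time, must be > 0


def rank_by_uncertainty(pool):
    """Classic uncertainty sampling: most uncertain sentences first."""
    return sorted(pool, key=lambda c: c.uncertainty, reverse=True)


def rank_by_uncertainty_per_second(pool):
    """Cost-aware variant: uncertainty gained per second of annotation."""
    return sorted(pool, key=lambda c: c.uncertainty / c.est_seconds, reverse=True)


def select_within_budget(ranked, budget_seconds):
    """Walk the ranking and keep every sentence that still fits the budget."""
    chosen, spent = [], 0.0
    for c in ranked:
        if spent + c.est_seconds <= budget_seconds:
            chosen.append(c)
            spent += c.est_seconds
    return chosen


# Toy pool: the most uncertain sentence is also the slowest to annotate.
pool = [
    Candidate("long sentence with many entities", 0.90, 60.0),
    Candidate("medium sentence", 0.60, 20.0),
    Candidate("short sentence", 0.50, 10.0),
    Candidate("very short sentence", 0.30, 5.0),
]

budget = 60.0  # seconds of annotator time available in this round
plain = select_within_budget(rank_by_uncertainty(pool), budget)
aware = select_within_budget(rank_by_uncertainty_per_second(pool), budget)

plain_gain = sum(c.uncertainty for c in plain)
aware_gain = sum(c.uncertainty for c in aware)
print(len(plain), plain_gain)
print(len(aware), aware_gain)
```

In this toy pool, plain uncertainty sampling spends the whole budget on the single long sentence, while the cost-aware ranking fits three shorter sentences into the same time. Whether such a ranking helps in practice depends on how well annotation time can be estimated, which the abstract lists as future work.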


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1bfa/5506567/e5afff814ec5/12911_2017_466_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1bfa/5506567/e93211799494/12911_2017_466_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1bfa/5506567/39bdd976038c/12911_2017_466_Fig3_HTML.jpg

Similar articles

1
An active learning-enabled annotation system for clinical named entity recognition.
BMC Med Inform Decis Mak. 2017 Jul 5;17(Suppl 2):82. doi: 10.1186/s12911-017-0466-9.
2
A study of active learning methods for named entity recognition in clinical text.
J Biomed Inform. 2015 Dec;58:11-18. doi: 10.1016/j.jbi.2015.09.010. Epub 2015 Sep 15.
3
Cost-aware active learning for named entity recognition in clinical text.
J Am Med Inform Assoc. 2019 Nov 1;26(11):1314-1322. doi: 10.1093/jamia/ocz102.
4
Clinical text annotation - what factors are associated with the cost of time?
AMIA Annu Symp Proc. 2018 Dec 5;2018:1552-1560. eCollection 2018.
5
Utilizing active learning strategies in machine-assisted annotation for clinical named entity recognition: a comprehensive analysis considering annotation costs and target effectiveness.
J Am Med Inform Assoc. 2024 Nov 1;31(11):2632-2640. doi: 10.1093/jamia/ocae197.
6
Active learning reduces annotation time for clinical concept extraction.
Int J Med Inform. 2017 Oct;106:25-31. doi: 10.1016/j.ijmedinf.2017.08.001. Epub 2017 Aug 5.
7
Active learning for ontological event extraction incorporating named entity recognition and unknown word handling.
J Biomed Semantics. 2016 Apr 27;7:22. doi: 10.1186/s13326-016-0059-z. eCollection 2016.
8
Entity recognition from clinical texts via recurrent neural network.
BMC Med Inform Decis Mak. 2017 Jul 5;17(Suppl 2):67. doi: 10.1186/s12911-017-0468-7.
9
Evaluating the impact of pre-annotation on annotation speed and potential bias: natural language processing gold standard development for clinical named entity recognition in clinical trial announcements.
J Am Med Inform Assoc. 2014 May-Jun;21(3):406-13. doi: 10.1136/amiajnl-2013-001837. Epub 2013 Sep 3.
10
Applying active learning to assertion classification of concepts in clinical text.
J Biomed Inform. 2012 Apr;45(2):265-72. doi: 10.1016/j.jbi.2011.11.003. Epub 2011 Nov 22.

Cited by

1
Utilizing active learning strategies in machine-assisted annotation for clinical named entity recognition: a comprehensive analysis considering annotation costs and target effectiveness.
J Am Med Inform Assoc. 2024 Nov 1;31(11):2632-2640. doi: 10.1093/jamia/ocae197.
2
Sample Size Considerations for Fine-Tuning Large Language Models for Named Entity Recognition Tasks: Methodological Study.
JMIR AI. 2024 May 16;3:e52095. doi: 10.2196/52095.
3
Social and Behavioral Determinants of Health in the Era of Artificial Intelligence with Electronic Health Records: A Scoping Review.
Health Data Sci. 2021 Aug 24;2021:9759016. doi: 10.34133/2021/9759016. eCollection 2021.
4
A study of deep active learning methods to reduce labelling efforts in biomedical relation extraction.
PLoS One. 2023 Dec 15;18(12):e0292356. doi: 10.1371/journal.pone.0292356. eCollection 2023.
5
A Systematic Approach to Configuring MetaMap for Optimal Performance.
Methods Inf Med. 2022 Dec;61(S 02):e51-e63. doi: 10.1055/a-1862-0421. Epub 2022 May 25.
6
Annotating social determinants of health using active learning, and characterizing determinants using neural event extraction.
J Biomed Inform. 2021 Jan;113:103631. doi: 10.1016/j.jbi.2020.103631. Epub 2020 Dec 5.
7
Clinical concept extraction: A methodology review.
J Biomed Inform. 2020 Sep;109:103526. doi: 10.1016/j.jbi.2020.103526. Epub 2020 Aug 6.
8
Cost-aware active learning for named entity recognition in clinical text.
J Am Med Inform Assoc. 2019 Nov 1;26(11):1314-1322. doi: 10.1093/jamia/ocz102.
9
Cost-sensitive Active Learning for Phenotyping of Electronic Health Records.
AMIA Jt Summits Transl Sci Proc. 2019 May 6;2019:829-838. eCollection 2019.
10
Evaluating active learning methods for annotating semantic predications.
JAMIA Open. 2018 Oct;1(2):275-282. doi: 10.1093/jamiaopen/ooy021. Epub 2018 Jun 27.

References

1
A study of active learning methods for named entity recognition in clinical text.
J Biomed Inform. 2015 Dec;58:11-18. doi: 10.1016/j.jbi.2015.09.010. Epub 2015 Sep 15.
2
Applying active learning to high-throughput phenotyping algorithms for electronic health records data.
J Am Med Inform Assoc. 2013 Dec;20(e2):e253-9. doi: 10.1136/amiajnl-2013-001945. Epub 2013 Jul 13.
3
Applying active learning to supervised word sense disambiguation in MEDLINE.
J Am Med Inform Assoc. 2013 Sep-Oct;20(5):1001-6. doi: 10.1136/amiajnl-2012-001244. Epub 2013 Jan 30.
4
Active learning for clinical text classification: is it better than random sampling?
J Am Med Inform Assoc. 2012 Sep-Oct;19(5):809-16. doi: 10.1136/amiajnl-2011-000648. Epub 2012 Jun 15.
5
Applying active learning to assertion classification of concepts in clinical text.
J Biomed Inform. 2012 Apr;45(2):265-72. doi: 10.1016/j.jbi.2011.11.003. Epub 2011 Nov 22.
6
2010 i2b2/VA challenge on concepts, assertions, and relations in clinical text.
J Am Med Inform Assoc. 2011 Sep-Oct;18(5):552-6. doi: 10.1136/amiajnl-2011-000203. Epub 2011 Jun 16.
7
A study of machine-learning-based approaches to extract clinical entities and their assertions from discharge summaries.
J Am Med Inform Assoc. 2011 Sep-Oct;18(5):601-6. doi: 10.1136/amiajnl-2011-000163. Epub 2011 Apr 20.
8
Extracting medication information from clinical text.
J Am Med Inform Assoc. 2010 Sep-Oct;17(5):514-8. doi: 10.1136/jamia.2010.003947.
9
Clustering by passing messages between data points.
Science. 2007 Feb 16;315(5814):972-6. doi: 10.1126/science.1136800. Epub 2007 Jan 11.