

Web 2.0-based crowdsourcing for high-quality gold standard development in clinical natural language processing.

Author information

Zhai Haijun, Lingren Todd, Deleger Louise, Li Qi, Kaiser Megan, Stoutenborough Laura, Solti Imre

Affiliation

Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH 45229, USA.

Publication information

J Med Internet Res. 2013 Apr 2;15(4):e73. doi: 10.2196/jmir.2426.

PMID: 23548263
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3636329/
Abstract

BACKGROUND: A high-quality gold standard is vital for supervised, machine learning-based, clinical natural language processing (NLP) systems. In clinical NLP projects, expert annotators traditionally create the gold standard. However, traditional annotation is expensive and time-consuming. To reduce the cost of annotation, general NLP projects have turned to crowdsourcing based on Web 2.0 technology, which involves submitting smaller subtasks to a coordinated marketplace of workers on the Internet. Many studies have been conducted in the area of crowdsourcing, but only a few have focused on tasks in the general NLP field and only a handful in the biomedical domain, usually based upon very small pilot sample sizes. In addition, the quality of crowdsourced biomedical NLP corpora has never been exceptional when compared to traditionally developed gold standards. Previously reported results on the medical named entity annotation task showed a 0.68 F-measure-based agreement between crowdsourced and traditionally developed corpora.

OBJECTIVE: Building upon previous work from general crowdsourcing research, this study investigated the usability of crowdsourcing in the clinical NLP domain, with special emphasis on achieving high agreement between crowdsourced and traditionally developed corpora.

METHODS: To build the gold standard for evaluating the crowdsourcing workers' performance, 1042 clinical trial announcements (CTAs) from the ClinicalTrials.gov website were randomly selected and double annotated for medication names, medication types, and linked attributes. For the experiments, we used CrowdFlower, an Amazon Mechanical Turk-based crowdsourcing platform. We calculated sensitivity, precision, and F-measure to evaluate the quality of the crowd's work, and tested statistical significance (P<.001, chi-square test) to detect differences between the crowdsourced and traditionally developed annotations.

RESULTS: The agreement between the crowd's annotations and the traditionally generated corpora was high for (1) annotations (0.87 F-measure for medication names; 0.73 for medication types) and (2) correction of previous annotations (0.90, medication names; 0.76, medication types), and excellent for (3) linking medications with their attributes (0.96). Simple voting provided the best judgment aggregation approach. There was no statistically significant difference between the crowd- and traditionally generated corpora. Our results showed a 27.9% improvement over previously reported results on the medication named entity annotation task.

CONCLUSIONS: This study offers three contributions. First, we proved that crowdsourcing is a feasible, inexpensive, fast, and practical approach to collecting high-quality annotations for clinical text (when protected health information is excluded). We believe that well-designed user interfaces and a rigorous quality control strategy for entity annotation and linking were critical to the success of this work. Second, as a further contribution to the Internet-based crowdsourcing field, we will publicly release the JavaScript and CrowdFlower Markup Language infrastructure code that is necessary to utilize CrowdFlower's quality control and crowdsourcing interfaces for named entity annotations. Finally, to spur future research, we will release the CTA annotations that were generated by both the traditional and the crowdsourced approaches.
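The METHODS section describes computing sensitivity, precision, and F-measure over the crowd's annotations and testing significance with a chi-square test. Below is a minimal sketch of that evaluation in Python, assuming a hypothetical tuple representation of annotated spans and an assumed 2x2 correct/incorrect contingency table; the paper does not publish its scoring code.

```python
# Minimal sketch of the evaluation described under METHODS. The span
# representation (doc_id, start, end, label) and the contingency-table
# layout are assumptions made for illustration.
from scipy.stats import chi2_contingency

def agreement(crowd: set, gold: set) -> dict:
    """Span-level sensitivity, precision, and F-measure.

    Spans are hashable tuples such as (doc_id, start, end, label).
    """
    tp = len(crowd & gold)   # spans both corpora contain
    fp = len(crowd - gold)   # crowd spans missing from the gold standard
    fn = len(gold - crowd)   # gold-standard spans the crowd missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # a.k.a. recall
    f_measure = (2 * precision * sensitivity / (precision + sensitivity)
                 if (precision + sensitivity) else 0.0)
    return {"precision": precision,
            "sensitivity": sensitivity,
            "f_measure": f_measure}

def significance(crowd_correct, crowd_incorrect, trad_correct, trad_incorrect):
    """Chi-square test on a 2x2 correct/incorrect contingency table.

    The paper reports testing at P < .001; the exact table layout here
    is an assumption for this sketch.
    """
    chi2, p, dof, _expected = chi2_contingency(
        [[crowd_correct, crowd_incorrect],
         [trad_correct, trad_incorrect]])
    return chi2, p
```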

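The RESULTS section reports that simple voting was the best judgment aggregation approach. The sketch below shows one plausible reading of that rule, keeping a span only when a strict majority of the workers assigned to a unit marked it; the function name, data layout, and majority threshold are assumptions for illustration, not details taken from the paper.

```python
# One plausible reading of "simple voting" aggregation: keep a span only
# when a strict majority of the workers on that unit marked it. Field
# names and the majority threshold are assumptions for this sketch.
from collections import Counter

def aggregate_by_vote(judgments, workers_per_unit):
    """judgments: iterable of (unit_id, span) pairs, one per worker vote.

    Returns the (unit_id, span) pairs accepted by a strict majority of
    the workers assigned to each unit.
    """
    votes = Counter(judgments)  # (unit_id, span) -> number of votes
    return {key for key, n in votes.items() if n > workers_per_unit / 2}

# Example with 3 workers per unit: two of three votes accept the
# medication name, while the lone vote for the attribute is discarded.
worker_votes = [
    ("cta_0001", ("aspirin", "medication_name")),
    ("cta_0001", ("aspirin", "medication_name")),
    ("cta_0001", ("once daily", "frequency")),
]
print(aggregate_by_vote(worker_votes, workers_per_unit=3))
# {('cta_0001', ('aspirin', 'medication_name'))}
```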

Figures:
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0a44/3636329/95aeb4ebdec9/jmir_v15i4e73_fig1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0a44/3636329/a78eff8eedad/jmir_v15i4e73_fig2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0a44/3636329/df0ffb8a4e03/jmir_v15i4e73_fig3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0a44/3636329/c732874cd222/jmir_v15i4e73_fig4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0a44/3636329/fc972ae70fc6/jmir_v15i4e73_fig5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0a44/3636329/3704e75b5ab2/jmir_v15i4e73_fig6.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0a44/3636329/781e7db42bb8/jmir_v15i4e73_fig7.jpg

Similar articles

[1]
Microtask crowdsourcing for disease mention annotation in PubMed abstracts.

Pac Symp Biocomput. 2015

[2]
Crowdsourcing image annotation for nucleus detection and segmentation in computational pathology: evaluating experts, automated methods, and the crowd.

Pac Symp Biocomput. 2015

[3]
User-centered design of a web-based crowdsourcing-integrated semantic text annotation tool for building a mental health knowledge base.

J Biomed Inform. 2020 Oct

[4]
Evaluating the impact of pre-annotation on annotation speed and potential bias: natural language processing gold standard development for clinical named entity recognition in clinical trial announcements.

J Am Med Inform Assoc. 2013 Sep 3

[5]
Validity of Online Screening for Autism: Crowdsourcing Study Comparing Paid and Unpaid Diagnostic Tasks.

J Med Internet Res. 2019 May 23

[6]
Crowdsourcing Twitter annotations to identify first-hand experiences of prescription drug use.

J Biomed Inform. 2015 Dec

[7]
Boosting drug named entity recognition using an aggregate classifier.

Artif Intell Med. 2015 Oct

[8]
SemClinBr - a multi-institutional and multi-specialty semantically annotated corpus for Portuguese clinical NLP tasks.

J Biomed Semantics. 2022 May 8

[9]
A hybrid approach toward biomedical relation extraction training corpora: combining distant supervision with crowdsourcing.

Database (Oxford). 2020 Dec 1

Cited by

[1]
Opportunities and challenges of text mining in materials research.

iScience. 2021 Feb 6

[2]
Automated assessment of biological database assertions using the scientific literature.

BMC Bioinformatics. 2019 Apr 29

[3]
Improving Electronic Health Record Note Comprehension With NoteAid: Randomized Trial of Electronic Health Record Note Comprehension Interventions With Crowdsourced Workers.

J Med Internet Res. 2019 Jan 16

[4]
OC-2-KB: integrating crowdsourcing into an obesity and cancer knowledge base curation system.

BMC Med Inform Decis Mak. 2018 Jul 23

[5]
Mapping of Crowdsourcing in Health: Systematic Review.

J Med Internet Res. 2018 May 15

[6]
ComprehENotes, an Instrument to Assess Patient Reading Comprehension of Electronic Health Record Notes: Development and Validation.

J Med Internet Res. 2018 Apr 25

[7]
Applications of crowdsourcing in health: an overview.

J Glob Health. 2018 Jun

[8]
The application of crowdsourcing approaches to cancer research: a systematic review.

Cancer Med. 2017 Nov

[9]
Crowdsourcing and curation: perspectives from biology and natural language processing.

Database (Oxford). 2016 Aug 7

[10]
Crowdsourcing the Measurement of Interstate Conflict.

PLoS One. 2016 Jun 16

References

[1]
Towards comprehensive syntactic and semantic annotations of the clinical narrative.

J Am Med Inform Assoc. 2013 Jan 25

[2]
Building gold standard corpora for medical natural language processing tasks.

AMIA Annu Symp Proc. 2012

[3]
A sequence labeling approach to link medications and their attributes in clinical notes and clinical trial announcements for information extraction.

J Am Med Inform Assoc. 2012 Dec 25

[4]
Crowdsourcing malaria parasite quantification: an online game for analyzing images of infected thick blood smears.

J Med Internet Res. 2012 Nov 29

[5]
Using crowdsourcing technology for testing multilingual public health promotion materials.

J Med Internet Res. 2012 Jun 4

[6]
Overcoming barriers to NLP for clinical text: the role of shared tasks and the need for additional creative solutions.

J Am Med Inform Assoc. 2011

[7]
Using Amazon's Mechanical Turk for Annotating Medical Named Entities.

AMIA Annu Symp Proc. 2010

[8]
Leveraging crowdsourcing to facilitate the discovery of new medicines.

Sci Transl Med. 2011 Jun 22
