
Web 2.0-based crowdsourcing for high-quality gold standard development in clinical natural language processing.

Author Information

Zhai Haijun, Lingren Todd, Deleger Louise, Li Qi, Kaiser Megan, Stoutenborough Laura, Solti Imre

Affiliation

Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH 45229, USA.

Publication Information

J Med Internet Res. 2013 Apr 2;15(4):e73. doi: 10.2196/jmir.2426.

DOI:10.2196/jmir.2426
PMID:23548263
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3636329/
Abstract

BACKGROUND

A high-quality gold standard is vital for supervised, machine learning-based, clinical natural language processing (NLP) systems. In clinical NLP projects, expert annotators traditionally create the gold standard. However, traditional annotation is expensive and time-consuming. To reduce the cost of annotation, general NLP projects have turned to crowdsourcing based on Web 2.0 technology, which involves submitting smaller subtasks to a coordinated marketplace of workers on the Internet. Many studies have been conducted in the area of crowdsourcing, but only a few have focused on tasks in the general NLP field and only a handful in the biomedical domain, usually based upon very small pilot sample sizes. In addition, the quality of the crowdsourced biomedical NLP corpora was never exceptional when compared to traditionally-developed gold standards. Previously reported results on a medical named entity annotation task showed a 0.68 F-measure-based agreement between crowdsourced and traditionally-developed corpora.

OBJECTIVE

Building upon previous work from the general crowdsourcing research, this study investigated the usability of crowdsourcing in the clinical NLP domain with special emphasis on achieving high agreement between crowdsourced and traditionally-developed corpora.

METHODS

To build the gold standard for evaluating the crowdsourcing workers' performance, 1042 clinical trial announcements (CTAs) from the ClinicalTrials.gov website were randomly selected and double annotated for medication names, medication types, and linked attributes. For the experiments, we used CrowdFlower, an Amazon Mechanical Turk-based crowdsourcing platform. We calculated sensitivity, precision, and F-measure to evaluate the quality of the crowd's work and tested the statistical significance (P<.001, chi-square test) to detect differences between the crowdsourced and traditionally-developed annotations.
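The evaluation described above can be sketched in Python. This is a minimal illustration, not the study's actual implementation: the span tuple format and the strict exact-match criterion are assumptions made here for clarity.

```python
def annotation_agreement(gold, crowd):
    """Score crowd annotations against a gold standard.

    Each annotation is a hypothetical (doc_id, start, end, label) tuple;
    a crowd span counts as a true positive only on an exact match
    (an illustrative simplification of the study's matching criteria).
    """
    gold, crowd = set(gold), set(crowd)
    tp = len(gold & crowd)          # spans both sides agree on
    fp = len(crowd - gold)          # crowd spans absent from the gold standard
    fn = len(gold - crowd)          # gold spans the crowd missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    sensitivity = tp / (tp + fn) if tp + fn else 0.0   # a.k.a. recall
    f_measure = (2 * precision * sensitivity / (precision + sensitivity)
                 if precision + sensitivity else 0.0)
    return sensitivity, precision, f_measure
```

For example, if the crowd reproduces two of three gold spans and adds one spurious span, sensitivity, precision, and F-measure all come out to 2/3.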

RESULTS

The agreement between the crowd's annotations and the traditionally-generated corpora was high for: (1) annotations (0.87, F-measure for medication names; 0.73, medication types), (2) correction of previous annotations (0.90, medication names; 0.76, medication types), and excellent for (3) linking medications with their attributes (0.96). Simple voting provided the best judgment aggregation approach. There was no statistically significant difference between the crowd and traditionally-generated corpora. Our results showed a 27.9% improvement over previously reported results on the medication named entity annotation task.
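Simple voting, which the results identify as the best judgment aggregation approach, can be illustrated with a short sketch. The unit ids and labels below are hypothetical, and ties resolve by first-seen order, a simplification since the study's tie-breaking rule is not stated here:

```python
from collections import Counter

def aggregate_by_vote(judgments):
    """Aggregate redundant crowd judgments by simple majority voting.

    `judgments` maps a unit id (e.g. a candidate medication mention) to
    the list of labels submitted by independent workers; the most
    frequent label wins each unit.
    """
    return {unit: Counter(labels).most_common(1)[0][0]
            for unit, labels in judgments.items()}
```

With three workers per unit, `aggregate_by_vote({"mention_1": ["drug", "drug", "not_drug"]})` keeps the majority label `"drug"` and discards the dissenting judgment.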

CONCLUSIONS

This study offers three contributions. First, we proved that crowdsourcing is a feasible, inexpensive, fast, and practical approach to collect high-quality annotations for clinical text (when protected health information was excluded). We believe that well-designed user interfaces and a rigorous quality control strategy for entity annotation and linking were critical to the success of this work. Second, as a further contribution to the Internet-based crowdsourcing field, we will publicly release the JavaScript and CrowdFlower Markup Language infrastructure code that is necessary to utilize CrowdFlower's quality control and crowdsourcing interfaces for named entity annotations. Finally, to spur future research, we will release the CTA annotations that were generated by traditional and crowdsourced approaches.

Figures (1-7):
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0a44/3636329/95aeb4ebdec9/jmir_v15i4e73_fig1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0a44/3636329/a78eff8eedad/jmir_v15i4e73_fig2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0a44/3636329/df0ffb8a4e03/jmir_v15i4e73_fig3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0a44/3636329/c732874cd222/jmir_v15i4e73_fig4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0a44/3636329/fc972ae70fc6/jmir_v15i4e73_fig5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0a44/3636329/3704e75b5ab2/jmir_v15i4e73_fig6.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0a44/3636329/781e7db42bb8/jmir_v15i4e73_fig7.jpg

Similar Articles

1
Web 2.0-based crowdsourcing for high-quality gold standard development in clinical natural language processing.
J Med Internet Res. 2013 Apr 2;15(4):e73. doi: 10.2196/jmir.2426.
2
Microtask crowdsourcing for disease mention annotation in PubMed abstracts.
Pac Symp Biocomput. 2015:282-93.
3
Crowdsourcing image annotation for nucleus detection and segmentation in computational pathology: evaluating experts, automated methods, and the crowd.
Pac Symp Biocomput. 2015:294-305. doi: 10.1142/9789814644730_0029.
4
User-centered design of a web-based crowdsourcing-integrated semantic text annotation tool for building a mental health knowledge base.
J Biomed Inform. 2020 Oct;110:103571. doi: 10.1016/j.jbi.2020.103571. Epub 2020 Sep 19.
5
Evaluating the impact of pre-annotation on annotation speed and potential bias: natural language processing gold standard development for clinical named entity recognition in clinical trial announcements.
J Am Med Inform Assoc. 2014 May-Jun;21(3):406-13. doi: 10.1136/amiajnl-2013-001837. Epub 2013 Sep 3.
6
Validity of Online Screening for Autism: Crowdsourcing Study Comparing Paid and Unpaid Diagnostic Tasks.
J Med Internet Res. 2019 May 23;21(5):e13668. doi: 10.2196/13668.
7
Crowdsourcing Twitter annotations to identify first-hand experiences of prescription drug use.
J Biomed Inform. 2015 Dec;58:280-287. doi: 10.1016/j.jbi.2015.11.004. Epub 2015 Nov 7.
8
Boosting drug named entity recognition using an aggregate classifier.
Artif Intell Med. 2015 Oct;65(2):145-53. doi: 10.1016/j.artmed.2015.05.007. Epub 2015 Jun 17.
9
SemClinBr - a multi-institutional and multi-specialty semantically annotated corpus for Portuguese clinical NLP tasks.
J Biomed Semantics. 2022 May 8;13(1):13. doi: 10.1186/s13326-022-00269-1.
10
A hybrid approach toward biomedical relation extraction training corpora: combining distant supervision with crowdsourcing.
Database (Oxford). 2020 Dec 1;2020. doi: 10.1093/database/baaa104.

Cited By

1
Opportunities and challenges of text mining in materials research.
iScience. 2021 Feb 6;24(3):102155. doi: 10.1016/j.isci.2021.102155. eCollection 2021 Mar 19.
2
Automated assessment of biological database assertions using the scientific literature.
BMC Bioinformatics. 2019 Apr 29;20(1):216. doi: 10.1186/s12859-019-2801-x.
3
Improving Electronic Health Record Note Comprehension With NoteAid: Randomized Trial of Electronic Health Record Note Comprehension Interventions With Crowdsourced Workers.
J Med Internet Res. 2019 Jan 16;21(1):e10793. doi: 10.2196/10793.
4
OC-2-KB: integrating crowdsourcing into an obesity and cancer knowledge base curation system.
BMC Med Inform Decis Mak. 2018 Jul 23;18(Suppl 2):55. doi: 10.1186/s12911-018-0635-5.
5
Mapping of Crowdsourcing in Health: Systematic Review.
J Med Internet Res. 2018 May 15;20(5):e187. doi: 10.2196/jmir.9330.
6
ComprehENotes, an Instrument to Assess Patient Reading Comprehension of Electronic Health Record Notes: Development and Validation.
J Med Internet Res. 2018 Apr 25;20(4):e139. doi: 10.2196/jmir.9380.
7
Applications of crowdsourcing in health: an overview.
J Glob Health. 2018 Jun;8(1):010502. doi: 10.7189/jogh.08.010502.
8
The application of crowdsourcing approaches to cancer research: a systematic review.
Cancer Med. 2017 Nov;6(11):2595-2605. doi: 10.1002/cam4.1165. Epub 2017 Sep 29.
9
Crowdsourcing and curation: perspectives from biology and natural language processing.
Database (Oxford). 2016 Aug 7;2016. doi: 10.1093/database/baw115. Print 2016.
10
Crowdsourcing the Measurement of Interstate Conflict.
PLoS One. 2016 Jun 16;11(6):e0156527. doi: 10.1371/journal.pone.0156527. eCollection 2016.

References

1
Towards comprehensive syntactic and semantic annotations of the clinical narrative.
J Am Med Inform Assoc. 2013 Sep-Oct;20(5):922-30. doi: 10.1136/amiajnl-2012-001317. Epub 2013 Jan 25.
2
Building gold standard corpora for medical natural language processing tasks.
AMIA Annu Symp Proc. 2012;2012:144-53. Epub 2012 Nov 3.
3
A sequence labeling approach to link medications and their attributes in clinical notes and clinical trial announcements for information extraction.
J Am Med Inform Assoc. 2013 Sep-Oct;20(5):915-21. doi: 10.1136/amiajnl-2012-001487. Epub 2012 Dec 25.
4
Crowdsourcing malaria parasite quantification: an online game for analyzing images of infected thick blood smears.
J Med Internet Res. 2012 Nov 29;14(6):e167. doi: 10.2196/jmir.2338.
5
Using crowdsourcing technology for testing multilingual public health promotion materials.
J Med Internet Res. 2012 Jun 4;14(3):e79. doi: 10.2196/jmir.2063.
6
Overcoming barriers to NLP for clinical text: the role of shared tasks and the need for additional creative solutions.
J Am Med Inform Assoc. 2011 Sep-Oct;18(5):540-3. doi: 10.1136/amiajnl-2011-000465.
7
Using Amazon's Mechanical Turk for Annotating Medical Named Entities.
AMIA Annu Symp Proc. 2010;2010:1316.
8
Leveraging crowdsourcing to facilitate the discovery of new medicines.
Sci Transl Med. 2011 Jun 22;3(88):88mr1. doi: 10.1126/scitranslmed.3002678.