Clinical text annotation - what factors are associated with the cost of time?

Authors

Wei Qiang, Franklin Amy, Cohen Trevor, Xu Hua

Affiliation

School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, USA.

Publication

AMIA Annu Symp Proc. 2018 Dec 5;2018:1552-1560. eCollection 2018.

PMID: 30815201
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6371268/
Abstract

Building high-quality annotated clinical corpora is necessary for developing statistical Natural Language Processing (NLP) models to unlock information embedded in clinical text, but it is also time consuming and expensive. Consequently, it is important to identify factors that may affect annotation time, such as the syntactic complexity of the text to be annotated and the vagaries of individual user behavior. However, limited work has been done to understand annotation of clinical text. In this study, we aimed to investigate how factors inherent to the text affect annotation time for a named entity recognition (NER) task. We recruited 9 users to annotate a clinical corpus and recorded annotation time for each sample. Then we defined a set of factors that we hypothesized might affect annotation time, and fitted them into a linear regression model to predict annotation time. The linear regression model achieved an R² of 0.611 and revealed eight time-associated factors, including characteristics of sentences, individual users, and annotation order, with implications for the practice of annotation and the development of cost models for active learning research.
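
The regression step described in the abstract can be illustrated with a minimal sketch. It is not the authors' code or data: the three features (token count, entity count, annotation order), the coefficients used to generate the data, and the helper names `fit_ols`, `predict`, and `r_squared` are all hypothetical, and the annotation times are synthetic. It only shows how an ordinary least squares model can be fitted to per-sample factors and scored with the coefficient of determination.

```python
import random


def fit_ols(X, y):
    """Ordinary least squares via the normal equations (X^T X) b = X^T y."""
    rows = [[1.0] + list(r) for r in X]  # prepend an intercept column
    k = len(rows[0])
    # Augmented matrix [X^T X | X^T y]
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * t for r, t in zip(rows, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def predict(beta, x):
    """Predicted annotation time for one feature vector."""
    return beta[0] + sum(b * v for b, v in zip(beta[1:], x))


def r_squared(beta, X, y):
    """Coefficient of determination of the fitted model on (X, y)."""
    mean_y = sum(y) / len(y)
    ss_res = sum((t - predict(beta, x)) ** 2 for x, t in zip(X, y))
    ss_tot = sum((t - mean_y) ** 2 for t in y)
    return 1.0 - ss_res / ss_tot


# Synthetic data only: hypothetical factors and made-up generating coefficients.
rng = random.Random(0)
X, y = [], []
for order in range(200):
    n_tokens = rng.randint(5, 40)    # hypothetical sentence-length factor
    n_entities = rng.randint(0, 6)   # hypothetical entity-count factor
    X.append((n_tokens, n_entities, order))
    seconds = 2.0 + 0.3 * n_tokens + 1.5 * n_entities - 0.01 * order
    y.append(seconds + rng.gauss(0, 2.0))  # add noise

beta = fit_ols(X, y)
r2 = r_squared(beta, X, y)
print("coefficients:", [round(b, 3) for b in beta])
print("R^2:", round(r2, 3))
```

In a cost-aware active learning setting, a fitted model of this kind could serve as the cost estimate for an unlabeled sentence; which factors to include would need to come from the study itself.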
