Automating tissue bank annotation from pathology reports - comparison to a gold standard expert annotation set.

Authors

Liu Kaihong, Mitchell Kevin J, Chapman Wendy W, Crowley Rebecca S

Affiliation

Center for Biomedical Informatics, University of Pittsburgh, PA, USA.

Publication

AMIA Annu Symp Proc. 2005;2005:460-4.

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC1560700/

Abstract

Surgical pathology specimens are an important resource for medical research, particularly for cancer research. Although research studies would benefit from information derived from the surgical pathology reports, access to this information is limited by use of unstructured free-text in the reports. We have previously described a pipeline-based system for automated annotation of surgical pathology reports with UMLS concepts, which has been used to code over 450,000 surgical pathology reports at our institution. In addition to coding UMLS terms, it annotates values of several key variables, such as TNM stage and cancer grade. The object of this study was to evaluate the potential and limitations of automated extraction of these variables, by measuring the performance of the system against a true gold standard - manually encoded data entered by expert tissue annotators. We categorized and analyzed errors to determine the potential and limitations of information extraction from pathology reports for the purpose of automated biospecimen annotation.
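
The comparison against a gold standard described above can be illustrated with a minimal sketch. The abstract does not say which metrics or data layout the study used, so everything here is an assumption made for illustration: the function name `score_against_gold`, the report IDs, the example T-stage values, and the choice of precision and recall as the measures.

```python
from collections import Counter


def score_against_gold(predicted, gold):
    """Compare extracted values with gold-standard values, report by report.

    Both arguments map a report ID to a value such as a T stage, or to
    None when no value was annotated for that report. This layout is a
    hypothetical one chosen for the example, not the study's format.
    """
    counts = Counter()
    for report_id in set(predicted) | set(gold):
        p = predicted.get(report_id)
        g = gold.get(report_id)
        if p is not None and p == g:
            counts["tp"] += 1          # system value matches the expert value
        else:
            if p is not None:
                counts["fp"] += 1      # system value is wrong or spurious
            if g is not None:
                counts["fn"] += 1      # expert value was missed or mismatched
    extracted = counts["tp"] + counts["fp"]
    expected = counts["tp"] + counts["fn"]
    precision = counts["tp"] / extracted if extracted else 0.0
    recall = counts["tp"] / expected if expected else 0.0
    return precision, recall


# Made-up example data: expert annotations versus system output.
gold = {"r1": "T2", "r2": "T1", "r3": "T3", "r4": None, "r5": "T4"}
predicted = {"r1": "T2", "r2": "T2", "r3": None, "r4": "T1", "r5": "T4"}

precision, recall = score_against_gold(predicted, gold)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this sketch a mismatched value counts as both a false positive and a false negative, which is one common convention; an error analysis like the one the abstract mentions would go further and categorize each disagreement by cause.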
