Leveraging Natural Language Processing to Extract Features of Colorectal Polyps From Pathology Reports for Epidemiologic Study.

Affiliations

Department of Biomedical Informatics, University of Utah, Salt Lake City, UT.

Huntsman Cancer Institute, University of Utah, Salt Lake City, UT.

Publication information

JCO Clin Cancer Inform. 2023 Jan;7:e2200131. doi: 10.1200/CCI.22.00131.

DOI: 10.1200/CCI.22.00131
PMID: 36753686
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10166420/

Abstract

PURPOSE

Histopathologic features are critical for studying risk factors of colorectal polyps, but remain deeply embedded within unstructured pathology reports, requiring costly and time-consuming manual abstraction for research. In this study, we developed and evaluated a natural language processing (NLP) pipeline to automatically extract histopathologic features of colorectal polyps from pathology reports, with an emphasis on individual polyp size. These data were then linked with structured electronic health record (EHR) data, creating an analysis-ready epidemiologic data set.

METHODS

We obtained 24,584 pathology reports from colonoscopies performed at the University of Utah's Gastroenterology Clinic. Two investigators annotated 350 reports to determine inter-rater agreement, develop an annotation scheme, and create a reference standard for performance evaluation. The pipeline was then developed, and performance was compared against the reference for extracting polyp location, histology, size, shape, dysplasia, and the number of polyps. Finally, the pipeline was applied to 24,225 unseen reports and NLP-extracted data were linked with structured EHR data.
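
To make the size-extraction step concrete, here is a minimal sketch of pulling polyp sizes out of free-text report sentences. The abstract does not say which technique the pipeline uses, so the regular expression, the function name `extract_sizes_cm`, and the sample sentence are all illustrative assumptions, not the authors' method.

```python
import re

# Hypothetical pattern (an assumption, not taken from the paper):
# a number, optional whitespace, then a unit of "cm" or "mm".
SIZE_RE = re.compile(r"(\d+(?:\.\d+)?)\s*(cm|mm)\b", re.IGNORECASE)


def extract_sizes_cm(text):
    """Return every size mentioned in `text`, normalized to centimeters."""
    sizes = []
    for value, unit in SIZE_RE.findall(text):
        size = float(value)
        if unit.lower() == "mm":
            size /= 10.0  # convert millimeters to centimeters
        sizes.append(round(size, 2))
    return sizes


# Made-up example sentence, for illustration only.
print(extract_sizes_cm("Polyp, 0.5 cm. Second fragment 3 mm."))  # [0.5, 0.3]
```

A real pipeline would also need to attach each size to the correct polyp and handle multi-dimensional measurements, which this sketch does not attempt.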

RESULTS

Across all features, our pipeline achieved a precision of 98.9%, a recall of 98.0%, and an F1-score of 98.4%. In patients with polyps, the pipeline correctly extracted 95.6% of sizes, 97.2% of polyp locations, 97.8% of histology, 98.3% of shapes, and 98.3% of dysplasia levels. When applied to unseen data, the pipeline classified 12,889 patients as having polyps, 4,907 patients without polyps, and extracted the features of 28,387 polyps. Tubular adenomas were the most common subtype (55.9%), 8.1% of polyps were advanced adenomas, and the mean polyp size was 0.57 (±0.4) cm.
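
The reported F1-score can be checked against the reported precision and recall, since F1 is their harmonic mean. The short sketch below does that arithmetic; the function name `f1_score` is an illustrative choice, and the inputs are the two percentages quoted above.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both as fractions in [0, 1])."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision 98.9% and recall 98.0%, as reported in the abstract.
f1 = f1_score(0.989, 0.980)
print(f"{f1:.1%}")  # 98.4%
```

The result agrees with the 98.4% stated in the abstract.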

CONCLUSION

Our pipeline extracted histopathologic features of colorectal polyps from colonoscopy pathology reports, most notably individual polyp sizes, with considerable accuracy. This study demonstrates the utility of NLP for extracting polyp features and linking these data with EHR data to create an epidemiologic data set to study colorectal polyp risk factors and outcomes.
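
The linkage of NLP-extracted features to structured EHR data can be pictured as a join on a shared patient identifier. The sketch below assumes such a key exists; the field names (`patient_id`, `histology`, `size_cm`, `age`, `sex`) and every record are hypothetical placeholders invented for illustration, not data or schema from the study.

```python
def link_records(polyps, ehr_by_patient):
    """Attach structured EHR fields to each NLP-extracted polyp record.

    Polyps whose patient has no EHR entry are skipped.
    """
    linked = []
    for polyp in polyps:
        ehr_fields = ehr_by_patient.get(polyp["patient_id"])
        if ehr_fields is None:
            continue  # no matching structured record for this patient
        linked.append({**polyp, **ehr_fields})
    return linked


# Made-up records, for illustration only.
nlp_polyps = [
    {"patient_id": "P1", "histology": "tubular adenoma", "size_cm": 0.5},
    {"patient_id": "P2", "histology": "hyperplastic polyp", "size_cm": 0.3},
    {"patient_id": "P9", "histology": "tubular adenoma", "size_cm": 1.1},
]
ehr = {
    "P1": {"age": 61, "sex": "F"},
    "P2": {"age": 54, "sex": "M"},
}

for row in link_records(nlp_polyps, ehr):
    print(row)
```

Each output row is one polyp with its patient's structured fields attached, which is the one-row-per-polyp shape an analysis-ready data set would typically take.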
