Automated Generation of Synoptic Reports from Narrative Pathology Reports in University Malaya Medical Centre Using Natural Language Processing.

Authors

Tan Wee-Ming, Teoh Kean-Hooi, Ganggayah Mogana Darshini, Taib Nur Aishah, Zaini Hana Salwani, Dhillon Sarinder Kaur

Affiliations

Data Science & Bioinformatics Laboratory, Institute of Biological Sciences, Faculty of Science, University of Malaya, Kuala Lumpur 50603, Malaysia.

Laboratory Department, Sunway Medical Centre, Bandar Sunway 47500, Malaysia.

Publication

Diagnostics (Basel). 2022 Apr 1;12(4):879. doi: 10.3390/diagnostics12040879.

DOI: 10.3390/diagnostics12040879
PMID: 35453927
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9027647/
Abstract

Pathology reports represent a primary source of information for cancer registries. University Malaya Medical Centre (UMMC) is a tertiary hospital responsible for training pathologists; thus narrative reporting becomes important. However, the unstructured free-text reports made the information extraction process tedious for clinical audits and data analysis-related research. This study aims to develop an automated natural language processing (NLP) algorithm to summarize the existing narrative breast pathology report from UMMC to a narrower structured synoptic pathology report with a checklist-style report template to ease the creation of pathology reports. The development of the rule-based NLP algorithm was based on the R programming language by using 593 pathology specimens from 174 patients provided by the Department of Pathology, UMMC. The pathologist provides specific keywords for data elements to define the semantic rules of the NLP. The system was evaluated by calculating the precision, recall, and F1-score. The proposed NLP algorithm achieved a micro-F1 score of 99.50% and a macro-F1 score of 98.97% on 178 specimens with 25 data elements. This achievement correlated to clinicians' needs, which could improve communication between pathologists and clinicians. The study presented here is significant, as structured data is easily minable and could generate important insights.
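
The abstract describes keyword-driven semantic rules and an evaluation by precision, recall, and micro- and macro-averaged F1. The sketch below illustrates that general idea in Python rather than R, purely for illustration. The rule patterns, data element names, sample reports, and error-counting convention are all assumptions made for this example and are not taken from the study.

```python
import re
from collections import Counter

# Hypothetical keyword rules: one regex per data element, with a single
# capture group holding the extracted value. Not the study's actual rules.
RULES = {
    "laterality": r"\b(left|right)\s+breast\b",
    "histological_type": r"\b(invasive (?:ductal|lobular) carcinoma)\b",
    "grade": r"\bgrade\s+([1-3])\b",
    "er_status": r"\ber\s*(?:status)?\s*(?:is|:)?\s*(positive|negative)\b",
}


def extract(report):
    """Apply every rule to a narrative report; unmatched elements are None."""
    text = report.lower()
    result = {}
    for element, pattern in RULES.items():
        match = re.search(pattern, text)
        result[element] = match.group(1) if match else None
    return result


def f1(tp, fp, fn):
    """F1 from raw counts, returning 0.0 when a denominator would be zero."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def evaluate(predictions, gold):
    """Return (micro_f1, macro_f1) over all data elements.

    Assumed convention: a correct value is a true positive; a wrong or
    spurious value is a false positive; a missed or wrong value where the
    gold standard has one is a false negative.
    """
    counts = {element: Counter() for element in RULES}
    for pred, truth in zip(predictions, gold):
        for element in RULES:
            p, g = pred.get(element), truth.get(element)
            if p is not None and p == g:
                counts[element]["tp"] += 1
            else:
                if p is not None:
                    counts[element]["fp"] += 1
                if g is not None:
                    counts[element]["fn"] += 1
    # Macro: average the per-element F1 scores (each element weighs equally).
    per_element = [f1(c["tp"], c["fp"], c["fn"]) for c in counts.values()]
    macro = sum(per_element) / len(per_element)
    # Micro: pool the counts across elements, then compute a single F1.
    total = sum(counts.values(), Counter())
    micro = f1(total["tp"], total["fp"], total["fn"])
    return micro, macro


# Invented sample reports and gold-standard annotations.
REPORTS = [
    "Left breast, wide local excision: invasive ductal carcinoma, grade 2. ER positive.",
    "Right breast mastectomy shows invasive lobular carcinoma, Grade 3. "
    "Oestrogen receptor negative.",
]
GOLD = [
    {"laterality": "left", "histological_type": "invasive ductal carcinoma",
     "grade": "2", "er_status": "positive"},
    {"laterality": "right", "histological_type": "invasive lobular carcinoma",
     "grade": "3", "er_status": "negative"},
]

predictions = [extract(report) for report in REPORTS]
micro, macro = evaluate(predictions, GOLD)
print(f"micro-F1 = {micro:.3f}, macro-F1 = {macro:.3f}")
```

In this toy data the `er_status` rule does not recognise the spelled-out "Oestrogen receptor", so one value is missed. That single miss lowers the macro score (about 0.917) more than the micro score (about 0.933), because macro averaging gives every data element equal weight regardless of how many values it contributes.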

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/05b8/9027647/54f9a3fef997/diagnostics-12-00879-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/05b8/9027647/fd9de99dc9c2/diagnostics-12-00879-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/05b8/9027647/f8424e5516ce/diagnostics-12-00879-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/05b8/9027647/d9bade4b408d/diagnostics-12-00879-g003.jpg
