Pathology report data extraction from relational database using R, with extraction from reports on melanoma of skin as an example.

Author information

Ye Jay J

Affiliation

Dahl-Chase Pathology Associates, Bangor, Maine, USA.

Publication information

J Pathol Inform. 2016 Oct 21;7:44. doi: 10.4103/2153-3539.192822. eCollection 2016.

DOI:10.4103/2153-3539.192822
PMID:28066684
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5100200/
Abstract

BACKGROUND

Different methods have been described for data extraction from pathology reports with varying degrees of success. Here a technique for directly extracting data from relational database is described.

METHODS

Our department uses synoptic reports modified from College of American Pathologists (CAP) Cancer Protocol Templates to report most of our cancer diagnoses. Choosing the melanoma of skin synoptic report as an example, R scripting language extended with RODBC package was used to query the pathology information system database. Reports containing melanoma of skin synoptic report in the past 4 and a half years were retrieved and individual data elements were extracted. Using the retrieved list of the cases, the database was queried a second time to retrieve/extract the lymph node staging information in the subsequent reports from the same patients.
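The two-pass workflow described above can be sketched as follows. This is only an illustration, written in Python with an in-memory SQLite database standing in for the R/RODBC connection to the pathology information system. The table name `reports`, its columns, the synoptic field labels, and the sample rows are all hypothetical and are not taken from the paper.

```python
import re
import sqlite3

# Hypothetical schema and rows; a real pathology information system will differ.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE reports "
    "(case_id TEXT, patient_id TEXT, signout_date TEXT, report_text TEXT)"
)
conn.executemany(
    "INSERT INTO reports VALUES (?, ?, ?, ?)",
    [
        ("S14-001", "P1", "2014-03-02",
         "MELANOMA OF SKIN SYNOPTIC REPORT\nBreslow depth: 1.2 mm\nLymph nodes: pNX"),
        ("S14-050", "P1", "2014-04-10",
         "Sentinel lymph node, excision.\nPathologic staging update: pN1"),
        ("S15-007", "P2", "2015-01-15",
         "MELANOMA OF SKIN SYNOPTIC REPORT\nBreslow depth: 0.4 mm\nLymph nodes: pNX"),
        ("S15-099", "P3", "2015-06-01", "Basal cell carcinoma, nodular type."),
    ],
)

depth_pattern = re.compile(r"Breslow depth:\s*([\d.]+)\s*mm", re.IGNORECASE)
# pNX is deliberately not matched: it carries no new staging information.
pn_pattern = re.compile(r"\bpN[0-3][a-c]?\b")

# Pass 1: retrieve reports containing the synoptic header, extract one element.
index_rows = conn.execute(
    "SELECT case_id, patient_id, signout_date, report_text FROM reports "
    "WHERE report_text LIKE '%MELANOMA OF SKIN SYNOPTIC REPORT%' "
    "ORDER BY case_id"
).fetchall()

cases = []
for case_id, patient_id, signout_date, text in index_rows:
    match = depth_pattern.search(text)
    cases.append({
        "case_id": case_id,
        "patient_id": patient_id,
        "signout_date": signout_date,
        "breslow_mm": float(match.group(1)) if match else None,
        "pn_update": None,
    })

# Pass 2: for each retrieved case, look in the same patient's later reports
# for the first explicit pN category.
for case in cases:
    later_reports = conn.execute(
        "SELECT report_text FROM reports "
        "WHERE patient_id = ? AND signout_date > ? ORDER BY signout_date",
        (case["patient_id"], case["signout_date"]),
    ).fetchall()
    for (text,) in later_reports:
        found = pn_pattern.search(text)
        if found:
            case["pn_update"] = found.group(0)
            break

for case in cases:
    print(case["case_id"], case["breslow_mm"], case["pn_update"])
```

Storing dates as ISO 8601 strings lets the "subsequent report" condition be a plain string comparison in this sketch; a production database would normally use a native date column.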

RESULTS

426 synoptic reports corresponding to unique lesions of melanoma of skin were retrieved, and data elements of interest were extracted into an R data frame. The distribution of Breslow depth of melanomas grouped by year is used as an example of intra-report data extraction and analysis. When the new pN staging information was present in the subsequent reports, 82% (77/94) was precisely retrieved (pN0, pN1, pN2 and pN3). Additional 15% (14/94) was retrieved with certain ambiguity (positive or knowing there was an update). The specificity was 100% for both. The relationship between Breslow depth and lymph node status was graphed as an example of lesion-specific multi-report data extraction and analysis.
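The retrieval rates quoted above follow directly from the reported counts. The short Python calculation below reproduces them. It assumes the precise and ambiguous groups are disjoint (as "additional" suggests), so the remaining count is derived here rather than stated in the abstract.

```python
# Counts taken from the abstract.
total_with_update = 94  # subsequent reports carrying new pN information
precise = 77            # retrieved as an exact category (pN0 to pN3)
ambiguous = 14          # retrieved only as "positive" or "an update exists"

# Derived under the assumption that the two groups do not overlap.
not_retrieved = total_with_update - precise - ambiguous

precise_rate = precise / total_with_update
ambiguous_rate = ambiguous / total_with_update
any_rate = (precise + ambiguous) / total_with_update

print(f"precise:       {precise_rate:.1%}")
print(f"ambiguous:     {ambiguous_rate:.1%}")
print(f"either:        {any_rate:.1%}")
print(f"not retrieved: {not_retrieved} of {total_with_update}")
```

The unrounded values (about 81.9% and 14.9%) round to the 82% and 15% given in the abstract.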

CONCLUSIONS

R extended with RODBC package is a simple and versatile approach well-suited for the above tasks. The success or failure of the retrieval and extraction depended largely on whether the reports were formatted and whether the contents of the elements were consistently phrased. This approach can be easily modified and adopted for other pathology information systems that use relational database for data management.
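The dependence on consistent phrasing can be illustrated with a small pattern-matching example in Python. The report lines and both regular expressions are hypothetical and are not drawn from the paper; they only show how a pattern written for one exact phrasing misses variants that a more tolerant pattern still captures.

```python
import re

# A pattern that expects one exact phrasing of the data element.
strict = re.compile(r"^Breslow depth: (\d+\.\d+) mm$")

# A more tolerant pattern: alternative label, optional separator,
# optional "at least" qualifier, optional decimals and spacing.
tolerant = re.compile(
    r"breslow\s*(?:depth|thickness)\s*[:=]?\s*(?:at least\s*)?"
    r"(\d+(?:\.\d+)?)\s*mm",
    re.IGNORECASE,
)

# Hypothetical report lines with inconsistent phrasing.
lines = [
    "Breslow depth: 1.20 mm",
    "Breslow thickness: 0.8mm",
    "BRESLOW DEPTH = at least 2 mm",
    "Depth of invasion cannot be assessed",
]


def extract(pattern, line):
    """Return the captured depth in mm as a float, or None if no match."""
    match = pattern.search(line)
    return float(match.group(1)) if match else None


strict_hits = [extract(strict, line) for line in lines]
tolerant_hits = [extract(tolerant, line) for line in lines]

print(strict_hits)
print(tolerant_hits)
```

Even the tolerant pattern discards the "at least" qualifier, so a captured number is not always the exact measurement. Free-text entries such as the last line yield nothing under either pattern.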

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/94d5/5100200/e8c0f5d9c77f/JPI-7-44-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/94d5/5100200/fc098c546fc7/JPI-7-44-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/94d5/5100200/6fd9b8da29f5/JPI-7-44-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/94d5/5100200/ec9d9159daf2/JPI-7-44-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/94d5/5100200/4aff3a38a2c2/JPI-7-44-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/94d5/5100200/ee4f5814603c/JPI-7-44-g006.jpg
