Assigning ICD-O-3 Codes to Pathology Reports using Neural Multi-Task Training with Hierarchical Regularization.

Authors

Rios Anthony, Durbin Eric B, Hands Isaac, Kavuluru Ramakanth

Affiliations

Dept. of Information Systems & Cyber Security, Cyber Center for Security & Analytics, University of Texas at San Antonio, San Antonio, Texas, USA.

Division of Biomedical Informatics (Internal Medicine), Kentucky Cancer Registry, University of Kentucky, Lexington, Kentucky, USA.

Publication information

ACM BCB. 2021 Aug;2021. doi: 10.1145/3459930.3469541.

DOI:10.1145/3459930.3469541
PMID:34541582
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8445227/
Abstract

Tracking population-level cancer information is essential for researchers, clinicians, policymakers, and the public. Unfortunately, much of the information is stored as unstructured data in pathology reports. Thus, to process the information, we require either automated extraction techniques or manual curation. Moreover, many of the cancer-related concepts appear infrequently in real-world training datasets. Automated extraction is difficult because of the limited data. This study introduces a novel technique that incorporates structured expert knowledge to improve histology and topography code classification models. Using pathology reports collected from the Kentucky Cancer Registry, we introduce a novel multi-task training approach with hierarchical regularization that incorporates structured information about the International Classification of Diseases for Oncology, 3rd Edition classes to improve predictive performance. Overall, we find that our method improves both micro and macro F1. For macro F1, we achieve up to a 6% absolute improvement for topography codes and up to 4% absolute improvement for histology codes.
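
The abstract reports gains in both micro and macro F1, and the distinction matters when many codes are rare. The sketch below is not the authors' evaluation code; it is a minimal single-label illustration using the standard definitions, and the function names and the toy labels (topography codes C50 and C34) are assumptions chosen for the example. Micro F1 pools counts over all reports, so frequent codes dominate; macro F1 averages per-code F1, so a rare code that is never predicted pulls the score down sharply.

```python
from collections import Counter


def f1(tp, fp, fn):
    """F1 from raw counts; defined as 0.0 when there are no counts at all."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_macro_f1(gold, pred):
    """Return (micro F1, macro F1) for single-label predictions."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted code p, but it was wrong
            fn[g] += 1  # missed the true code g
    labels = set(gold) | set(pred)
    # Micro: pool counts across all codes, then compute one F1.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    # Macro: compute F1 per code, then take the unweighted mean.
    macro = sum(f1(tp[c], fp[c], fn[c]) for c in labels) / len(labels)
    return micro, macro


# Toy data (hypothetical): 8 reports coded C50 and 2 coded C34,
# scored against a classifier that always predicts the frequent code.
gold = ["C50"] * 8 + ["C34"] * 2
pred = ["C50"] * 10
micro, macro = micro_macro_f1(gold, pred)
print(f"micro F1 = {micro:.3f}, macro F1 = {macro:.3f}")
```

In this toy case the always-predict-the-majority classifier still scores well on micro F1 but poorly on macro F1, which is why an improvement in macro F1 is read as better handling of infrequent codes.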
