Semantic classification of diseases in discharge summaries using a context-aware rule-based classifier.

Authors

Solt Illés, Tikk Domonkos, Gál Viktor, Kardkovács Zsolt T

Affiliation

Department of Media Informatics and Telematics, Budapest University of Technology and Economics, Budapest, Hungary.

Publication

J Am Med Inform Assoc. 2009 Jul-Aug;16(4):580-4. doi: 10.1197/jamia.M3087. Epub 2009 Apr 23.

DOI: 10.1197/jamia.M3087
PMID: 19390101
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2705263/
Abstract

OBJECTIVE Automated, disease-specific classification of textual clinical discharge summaries is of great importance in human life science, as it helps physicians conduct medical studies by providing statistically relevant data for analysis. This can be further facilitated if, when discharge summaries are labeled, semantic labels are also extracted from the text, such as whether a given disease is present, absent, or questionable in a patient, or is unmentioned in the document. The authors present a classification technique that successfully solves the semantic classification task.

DESIGN The authors introduce a context-aware rule-based semantic classification technique for use on clinical discharge summaries. The classification is performed in successive steps. First, some misleading parts are removed from the text; then the text is partitioned into positive, negative, and uncertain context segments; finally, a sequence of binary classifiers is applied to assign the appropriate semantic labels.

MEASUREMENT For evaluation, the authors used the documents of the i2b2 Obesity Challenge and adopted its evaluation measures, F(1)-macro and F(1)-micro.

RESULTS On the two subtasks of the Obesity Challenge (textual and intuitive classification), the system performed very well: it achieved F(1)-macro = 0.80 on the textual task and F(1)-macro = 0.67 on the intuitive task, placing second on the textual subtask and first on the intuitive subtask of the challenge.

CONCLUSIONS The authors show that a simple rule-based classifier can tackle the semantic classification task more successfully than machine learning techniques when the training data are limited and some semantic labels are very sparse.
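
The three steps named in the DESIGN paragraph can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not list the actual rules, so the cue words, the "family history" filter, the function names, and the order in which the binary checks run are all assumptions chosen for illustration. The labels Y, N, Q, and U stand for present, absent, questionable, and unmentioned.

```python
import re
from typing import Dict, List

# Hypothetical cue patterns; the paper's real rule set is not given in the abstract.
UNCERTAIN_CUES = re.compile(r"\b(possible|questionable|rule out|suspected)\b")
NEGATION_CUES = re.compile(r"\b(no|denies|without|negative for)\b")
# Hypothetical example of a "misleading part": statements about relatives.
MISLEADING_PREFIXES = ("family history",)


def split_sentences(text: str) -> List[str]:
    """Lowercase the text and split it on periods and newlines."""
    parts = re.split(r"[.\n]+", text.lower())
    return [p.strip() for p in parts if p.strip()]


def remove_misleading(sentences: List[str]) -> List[str]:
    """Step 1: drop sentences that do not describe the patient."""
    return [s for s in sentences if not s.startswith(MISLEADING_PREFIXES)]


def partition_contexts(sentences: List[str]) -> Dict[str, List[str]]:
    """Step 2: assign each sentence to a positive, negative, or uncertain context."""
    contexts: Dict[str, List[str]] = {"positive": [], "negative": [], "uncertain": []}
    for sentence in sentences:
        if UNCERTAIN_CUES.search(sentence):
            contexts["uncertain"].append(sentence)
        elif NEGATION_CUES.search(sentence):
            contexts["negative"].append(sentence)
        else:
            contexts["positive"].append(sentence)
    return contexts


def classify(text: str, disease: str) -> str:
    """Step 3: run binary checks in sequence (assumed order: Y, then N, then Q)."""
    contexts = partition_contexts(remove_misleading(split_sentences(text)))
    disease = disease.lower()
    for label, name in (("Y", "positive"), ("N", "negative"), ("Q", "uncertain")):
        if any(disease in sentence for sentence in contexts[name]):
            return label
    return "U"  # the disease is not mentioned in any remaining segment


# Invented sample text, not taken from the challenge data.
SAMPLE = "Family history: diabetes. Patient has hypertension. No asthma. Possible gout."
```

With the invented sample, `classify(SAMPLE, "hypertension")` returns `"Y"`, `"asthma"` gives `"N"`, `"gout"` gives `"Q"`, and `"diabetes"` gives `"U"` because the only mention sits in the filtered family-history sentence.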

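The two evaluation measures named in the abstract can be sketched as follows. This is a generic macro/micro F1 computation over single-label predictions, not the challenge's official scorer: the function name, the toy labels, and the choice to score a class with no true positives, false positives, or false negatives as 0.0 are assumptions.

```python
from typing import Dict, Sequence


def f1_macro_micro(gold: Sequence[str], pred: Sequence[str],
                   labels: Sequence[str]) -> Dict[str, float]:
    """Return macro- and micro-averaged F1 for single-label predictions."""
    tp = {label: 0 for label in labels}
    fp = {label: 0 for label in labels}
    fn = {label: 0 for label in labels}
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted label was wrong
            fn[g] += 1  # true label was missed

    def f1(t: int, p: int, n: int) -> float:
        denom = 2 * t + p + n
        return 2 * t / denom if denom else 0.0  # assumption: empty class scores 0

    # Macro: average the per-class F1 values, so every class counts equally.
    macro = sum(f1(tp[l], fp[l], fn[l]) for l in labels) / len(labels)
    # Micro: pool the counts over all classes before computing F1.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    return {"macro": macro, "micro": micro}


# Toy data, invented for illustration.
GOLD = ["Y", "Y", "N", "U", "U", "U"]
PRED = ["Y", "N", "N", "U", "U", "Q"]
SCORES = f1_macro_micro(GOLD, PRED, ["Y", "N", "Q", "U"])
```

On the toy data, the macro value is 8/15 (about 0.53) and the micro value is 2/3 (about 0.67). The single wrong "Q" prediction gives that class an F1 of 0, which pulls the macro average down far more than the micro average; this is one way sparse labels can weigh heavily on F(1)-macro.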
Similar articles

1. Semantic classification of diseases in discharge summaries using a context-aware rule-based classifier.
   J Am Med Inform Assoc. 2009 Jul-Aug;16(4):580-4. doi: 10.1197/jamia.M3087. Epub 2009 Apr 23.
2. A text mining approach to the prediction of disease status from clinical discharge summaries.
   J Am Med Inform Assoc. 2009 Jul-Aug;16(4):596-600. doi: 10.1197/jamia.M3096. Epub 2009 Apr 23.
3. Developing a FHIR-based EHR phenotyping framework: A case study for identification of patients with obesity and multiple comorbidities from discharge summaries.
   J Biomed Inform. 2019 Nov;99:103310. doi: 10.1016/j.jbi.2019.103310. Epub 2019 Oct 14.
4. Recognizing obesity and comorbidities in sparse data.
   J Am Med Inform Assoc. 2009 Jul-Aug;16(4):561-70. doi: 10.1197/jamia.M3115. Epub 2009 Apr 23.
5. A rule-based approach for identifying obesity and its comorbidities in medical discharge summaries.
   J Am Med Inform Assoc. 2009 Jul-Aug;16(4):576-9. doi: 10.1197/jamia.M3086. Epub 2009 Apr 23.
6. A classification approach to coreference in discharge summaries: 2011 i2b2 challenge.
   J Am Med Inform Assoc. 2012 Sep-Oct;19(5):897-905. doi: 10.1136/amiajnl-2011-000734. Epub 2012 Apr 13.
7. A system for classifying disease comorbidity status from medical discharge summaries using automated hotspot and negated concept detection.
   J Am Med Inform Assoc. 2009 Jul-Aug;16(4):590-5. doi: 10.1197/jamia.M3095. Epub 2009 Apr 23.
8. A study of machine-learning-based approaches to extract clinical entities and their assertions from discharge summaries.
   J Am Med Inform Assoc. 2011 Sep-Oct;18(5):601-6. doi: 10.1136/amiajnl-2011-000163. Epub 2011 Apr 20.
9. Recognition of medication information from discharge summaries using ensembles of classifiers.
   BMC Med Inform Decis Mak. 2012 May 7;12:36. doi: 10.1186/1472-6947-12-36.
10. Use of semantic features to classify patient smoking status.
    AMIA Annu Symp Proc. 2008 Nov 6;2008:450-4.

Cited by

1. Applying Natural Language Processing to Textual Data From Clinical Data Warehouses: Systematic Review.
   JMIR Med Inform. 2023 Dec 15;11:e42477. doi: 10.2196/42477.
2. A comparative study on deep learning models for text classification of unstructured medical notes with various levels of class imbalance.
   BMC Med Res Methodol. 2022 Jul 2;22(1):181. doi: 10.1186/s12874-022-01665-y.
3. Predicting mortality in critically ill patients with diabetes using machine learning and clinical notes.
   BMC Med Inform Decis Mak. 2020 Dec 30;20(Suppl 11):295. doi: 10.1186/s12911-020-01318-4.
4. Automatic Labeled Dialogue Generation for Nursing Record Systems.
   J Pers Med. 2020 Jul 16;10(3):62. doi: 10.3390/jpm10030062.
5. Machine learning for syndromic surveillance using veterinary necropsy reports.
   PLoS One. 2020 Feb 5;15(2):e0228105. doi: 10.1371/journal.pone.0228105. eCollection 2020.
6. Developing a FHIR-based EHR phenotyping framework: A case study for identification of patients with obesity and multiple comorbidities from discharge summaries.
   J Biomed Inform. 2019 Nov;99:103310. doi: 10.1016/j.jbi.2019.103310. Epub 2019 Oct 14.
7. Developing a portable natural language processing based phenotyping system.
   BMC Med Inform Decis Mak. 2019 Apr 4;19(Suppl 3):78. doi: 10.1186/s12911-019-0786-z.
8. Clinical text classification with rule-based features and knowledge-guided convolutional neural networks.
   BMC Med Inform Decis Mak. 2019 Apr 4;19(Suppl 3):71. doi: 10.1186/s12911-019-0781-4.
9. Extracting Information from Electronic Medical Records to Identify the Obesity Status of a Patient Based on Comorbidities and Bodyweight Measures.
   J Med Syst. 2016 Aug;40(8):191. doi: 10.1007/s10916-016-0548-8. Epub 2016 Jul 11.
10. Semantic biomedical resource discovery: a Natural Language Processing framework.
    BMC Med Inform Decis Mak. 2015 Sep 30;15:77. doi: 10.1186/s12911-015-0200-4.
