Multi-label classification of chronically ill patients with bag of words and supervised dimensionality reduction algorithms.

Authors

Bromuri Stefano, Zufferey Damien, Hennebert Jean, Schumacher Michael

Affiliations

University of Applied Sciences Western Switzerland, Institute of Business Information Systems, TechnoArk 3, CH-3960 Sierre, Switzerland.

Publication information

J Biomed Inform. 2014 Oct;51:165-75. doi: 10.1016/j.jbi.2014.05.010. Epub 2014 May 29.


DOI:10.1016/j.jbi.2014.05.010
PMID:24879897
Abstract

OBJECTIVE: This research is motivated by the issue of classifying illnesses of chronically ill patients for decision support in clinical settings. Our main objective is to propose multi-label classification of multivariate time series contained in medical records of chronically ill patients, by means of quantization methods, such as bag of words (BoW), and multi-label classification algorithms. Our second objective is to compare supervised dimensionality reduction techniques to state-of-the-art multi-label classification algorithms. The hypothesis is that kernel methods and locality preserving projections make such algorithms good candidates to study multi-label medical time series.

METHODS: We combine BoW and supervised dimensionality reduction algorithms to perform multi-label classification on health records of chronically ill patients. The considered algorithms are compared with state-of-the-art multi-label classifiers in two real world datasets. Portavita dataset contains 525 diabetes type 2 (DT2) patients, with co-morbidities of DT2 such as hypertension, dyslipidemia, and microvascular or macrovascular issues. MIMIC II dataset contains 2635 patients affected by thyroid disease, diabetes mellitus, lipoid metabolism disease, fluid electrolyte disease, hypertensive disease, thrombosis, hypotension, chronic obstructive pulmonary disease (COPD), liver disease and kidney disease. The algorithms are evaluated using multi-label evaluation metrics such as hamming loss, one error, coverage, ranking loss, and average precision.

RESULTS: Non-linear dimensionality reduction approaches behave well on medical time series quantized using the BoW algorithm, with results comparable to state-of-the-art multi-label classification algorithms. Chaining the projected features has a positive impact on the performance of the algorithm with respect to pure binary relevance approaches.

CONCLUSIONS: The evaluation highlights the feasibility of representing medical health records using the BoW for multi-label classification tasks. The study also highlights that dimensionality reduction algorithms based on kernel methods, locality preserving projections or both are good candidates to deal with multi-label classification tasks in medical time series with many missing values and high label density.
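
The abstract names BoW quantization and several multi-label metrics without defining them. The sketch below is one possible reading, not the authors' implementation: it assumes a codebook that is already given (in practice it would usually be learned by clustering), assigns every time-step vector to its nearest codeword by Euclidean distance, and uses the common textbook definitions of hamming loss, one error, coverage, and ranking loss. All function names and the toy data are hypothetical, missing values are not handled, and average precision is left out for brevity.

```python
from math import dist


def bow_histogram(series, codebook):
    """Map each time-step vector to its nearest codeword (Euclidean
    distance) and return the normalized histogram of codeword counts.
    Assumes a fixed, pre-computed codebook and no missing values."""
    counts = [0] * len(codebook)
    for point in series:
        nearest = min(range(len(codebook)),
                      key=lambda k: dist(point, codebook[k]))
        counts[nearest] += 1
    total = len(series)
    return [c / total for c in counts] if total else counts


def hamming_loss(y_true, y_pred):
    """Fraction of (patient, label) slots that are predicted wrongly."""
    slots = len(y_true) * len(y_true[0])
    wrong = sum(t != p
                for yt, yp in zip(y_true, y_pred)
                for t, p in zip(yt, yp))
    return wrong / slots


def one_error(y_true, scores):
    """Fraction of patients whose top-scored label is not a true label."""
    misses = 0
    for yt, s in zip(y_true, scores):
        top = max(range(len(s)), key=s.__getitem__)
        misses += yt[top] == 0
    return misses / len(y_true)


def coverage(y_true, scores):
    """Average depth in the ranked label list needed to reach every true
    label, counted from 0. Assumes each patient has at least one label."""
    total = 0
    for yt, s in zip(y_true, scores):
        order = sorted(range(len(s)), key=lambda j: -s[j])
        rank = {label: r for r, label in enumerate(order, start=1)}
        total += max(rank[j] for j, t in enumerate(yt) if t) - 1
    return total / len(y_true)


def ranking_loss(y_true, scores):
    """Average fraction of (true, false) label pairs ranked in the wrong
    order. Assumes each patient has at least one true and one false label."""
    total = 0.0
    for yt, s in zip(y_true, scores):
        pos = [j for j, t in enumerate(yt) if t]
        neg = [j for j, t in enumerate(yt) if not t]
        bad = sum(s[p] <= s[n] for p in pos for n in neg)
        total += bad / (len(pos) * len(neg))
    return total / len(y_true)


# Toy example: two patients, three labels (hypothetical values).
y_true = [[1, 0, 1], [0, 1, 0]]
scores = [[0.9, 0.2, 0.4], [0.6, 0.3, 0.1]]
y_pred = [[int(v >= 0.5) for v in row] for row in scores]

print(hamming_loss(y_true, y_pred))   # 0.5
print(one_error(y_true, scores))      # 0.5
print(coverage(y_true, scores))       # 1.0
print(ranking_loss(y_true, scores))   # 0.25
```

Lower values are better for all four metrics. Conventions for coverage differ between sources: some implementations report the rank without subtracting one, so values should be compared only within a single convention.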

