Bromuri Stefano, Zufferey Damien, Hennebert Jean, Schumacher Michael
University of Applied Sciences Western Switzerland, Institute of Business Information Systems, TechnoArk 3, CH-3960 Sierre, Switzerland.
J Biomed Inform. 2014 Oct;51:165-75. doi: 10.1016/j.jbi.2014.05.010. Epub 2014 May 29.
OBJECTIVE: This research is motivated by the problem of classifying the illnesses of chronically ill patients for decision support in clinical settings. Our main objective is to propose multi-label classification of the multivariate time series contained in the medical records of chronically ill patients, by means of quantization methods, such as bag of words (BoW), and multi-label classification algorithms. Our second objective is to compare supervised dimensionality reduction techniques with state-of-the-art multi-label classification algorithms. The hypothesis is that kernel methods and locality preserving projections make such algorithms good candidates for studying multi-label medical time series.
METHODS: We combine BoW and supervised dimensionality reduction algorithms to perform multi-label classification on health records of chronically ill patients. The considered algorithms are compared with state-of-the-art multi-label classifiers on two real-world datasets. The Portavita dataset contains 525 diabetes type 2 (DT2) patients, with co-morbidities of DT2 such as hypertension, dyslipidemia, and microvascular or macrovascular issues. The MIMIC II dataset contains 2635 patients affected by thyroid disease, diabetes mellitus, lipoid metabolism disease, fluid electrolyte disease, hypertensive disease, thrombosis, hypotension, chronic obstructive pulmonary disease (COPD), liver disease and kidney disease. The algorithms are evaluated using the multi-label evaluation metrics Hamming loss, one error, coverage, ranking loss, and average precision.
RESULTS: Non-linear dimensionality reduction approaches perform well on medical time series quantized using the BoW algorithm, with results comparable to state-of-the-art multi-label classification algorithms. Chaining the projected features improves performance relative to pure binary relevance approaches.
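For readers unfamiliar with the five evaluation metrics named in the methods, the following is a minimal Python sketch of their commonly used definitions. It is not the authors' evaluation code: the function names, the tie-breaking rule, the 0.5 threshold in the example, and the toy data are all assumptions made for illustration. Ranks start at 1 for the highest-scored label, and coverage is reported as the deepest true-label rank minus one.

```python
from typing import List, Sequence


def ranks(score_row: Sequence[float]) -> List[int]:
    """Rank labels by descending score; rank 1 is the top label (ties broken by index)."""
    order = sorted(range(len(score_row)), key=lambda j: -score_row[j])
    result = [0] * len(score_row)
    for position, j in enumerate(order, start=1):
        result[j] = position
    return result


def hamming_loss(y_true, y_pred) -> float:
    """Fraction of (patient, label) slots that are predicted incorrectly."""
    total = sum(len(row) for row in y_true)
    wrong = sum(t != p for tr, pr in zip(y_true, y_pred) for t, p in zip(tr, pr))
    return wrong / total


def one_error(y_true, scores) -> float:
    """Fraction of patients whose top-ranked label is not a true label."""
    misses = 0
    for tr, sc in zip(y_true, scores):
        top = max(range(len(sc)), key=lambda j: sc[j])
        misses += 1 if tr[top] == 0 else 0
    return misses / len(y_true)


def coverage(y_true, scores) -> float:
    """Average depth in the ranking needed to reach every true label, minus one."""
    depths = []
    for tr, sc in zip(y_true, scores):
        r = ranks(sc)
        true_ranks = [r[j] for j, t in enumerate(tr) if t == 1]
        # Assumption: a patient with no true label contributes a depth of 0.
        depths.append(max(true_ranks) - 1 if true_ranks else 0)
    return sum(depths) / len(depths)


def ranking_loss(y_true, scores) -> float:
    """Average fraction of (true, false) label pairs that are ordered wrongly."""
    losses = []
    for tr, sc in zip(y_true, scores):
        pos = [j for j, t in enumerate(tr) if t == 1]
        neg = [j for j, t in enumerate(tr) if t == 0]
        if not pos or not neg:
            continue  # the pairwise loss is undefined for this patient
        bad = sum(sc[n] >= sc[p] for p in pos for n in neg)
        losses.append(bad / (len(pos) * len(neg)))
    return sum(losses) / len(losses)


def average_precision(y_true, scores) -> float:
    """Average, over true labels, of the precision at each true label's rank."""
    values = []
    for tr, sc in zip(y_true, scores):
        r = ranks(sc)
        pos = [j for j, t in enumerate(tr) if t == 1]
        if not pos:
            continue
        per_label = [sum(r[k] <= r[j] for k in pos) / r[j] for j in pos]
        values.append(sum(per_label) / len(pos))
    return sum(values) / len(values)


# Toy example (made-up data): 2 patients, 3 candidate diagnoses.
y_true = [[1, 0, 1], [0, 1, 0]]
scores = [[0.9, 0.8, 0.4], [0.2, 0.7, 0.1]]
y_pred = [[1 if s >= 0.5 else 0 for s in row] for row in scores]

print(hamming_loss(y_true, y_pred))
print(one_error(y_true, scores))
print(coverage(y_true, scores))
print(ranking_loss(y_true, scores))
print(average_precision(y_true, scores))
```

Lower is better for Hamming loss, one error, coverage, and ranking loss; higher is better for average precision.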
CONCLUSIONS: The evaluation highlights the feasibility of representing medical health records using the BoW for multi-label classification tasks. The study also highlights that dimensionality reduction algorithms based on kernel methods, locality preserving projections or both are good candidates to deal with multi-label classification tasks in medical time series with many missing values and high label density.
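To make the BoW representation concrete, here is a minimal Python sketch of quantizing a multivariate time series with missing values into a codeword histogram. It is an illustration under stated assumptions, not the procedure used in the study: the codebook is hard-coded (in practice it would typically be learned by clustering, for example k-means), distances are computed over the observed dimensions only, and the names `nearest_codeword` and `bow_histogram` and the two example variables are hypothetical.

```python
from math import inf
from typing import List, Optional, Sequence

Observation = Sequence[Optional[float]]  # None marks a missing measurement


def nearest_codeword(obs: Observation, codebook: Sequence[Sequence[float]]) -> Optional[int]:
    """Return the index of the closest codeword, or None if nothing was measured.

    Assumption: distance is the mean squared difference over observed
    dimensions only, so partially missing observations can still be quantized.
    """
    best, best_dist = None, inf
    for k, word in enumerate(codebook):
        observed = [(x, w) for x, w in zip(obs, word) if x is not None]
        if not observed:
            return None
        dist = sum((x - w) ** 2 for x, w in observed) / len(observed)
        if dist < best_dist:
            best, best_dist = k, dist
    return best


def bow_histogram(series: Sequence[Observation], codebook: Sequence[Sequence[float]]) -> List[float]:
    """Map a patient's time series to a normalized histogram of codeword counts."""
    counts = [0] * len(codebook)
    for obs in series:
        k = nearest_codeword(obs, codebook)
        if k is not None:
            counts[k] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(codebook)


# Hypothetical codebook over two variables (systolic blood pressure, HbA1c).
codebook = [[120.0, 5.0], [160.0, 9.0]]

# One patient's visits; None marks a value that was not recorded.
series = [[118.0, None], [165.0, 8.5], [None, None], [158.0, None]]

histogram = bow_histogram(series, codebook)
print(histogram)
```

The resulting fixed-length histogram can be passed to any multi-label classifier regardless of how many visits a patient has. In a real pipeline the variables would usually be standardized first, so that one variable with a large numeric range does not dominate the distance.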