Department of Signal Theory and Communications, Telematics and Computing Systems, Rey Juan Carlos University, Fuenlabrada, Spain.
Department of Computer Science & Statistics, Rey Juan Carlos University, Fuenlabrada, Spain.
BMC Bioinformatics. 2020 Mar 11;21(Suppl 2):92. doi: 10.1186/s12859-020-3359-3.
Chronic diseases are becoming more widespread each year in developed countries, mainly due to increasing life expectancy. Among them, diabetes mellitus (DM) and essential hypertension (EH) are two of the most prevalent. Furthermore, they can lead to the onset of other chronic conditions such as kidney or obstructive pulmonary diseases. The need to understand the factors related to such complex diseases motivates the development of interpretable and visual analysis methods, such as classification trees, which not only provide predictive models for diagnosing patients but can also help to discover new clinical insights.
In this paper, we analyzed healthy and chronic (diabetic, hypertensive) patients associated with the University Hospital of Fuenlabrada in Spain. Each patient was assigned a single health status according to clinical risk groups (CRGs). The CRGs characterize a patient through features such as age, gender, diagnosis codes, and drug codes. Based on these features and the CRGs, we designed classification trees to determine the most discriminative decision features among different health statuses. In particular, we propose using statistical data visualizations to guide the selection of the feature at each node when constructing a tree. We created several classification trees to distinguish among patients with different health statuses, analyzed their performance in terms of classification accuracy, and drew clinical conclusions regarding the decision features considered in each tree. As expected, healthy patients and patients with a single chronic condition were classified more accurately than patients with comorbidities. The constructed classification trees also show that the use of antipsychotics and the diagnosis of chronic airway obstruction are relevant for classifying patients with more than one chronic condition, in conjunction with the usual DM and/or EH diagnoses.
We propose a methodology for constructing classification trees in a visually guided manner. The approach allows clinicians to progressively select the decision features at each of the tree nodes. The process is guided by exploratory data analysis visualizations, which may provide new insights and unexpected clinical information.
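The general idea of choosing each node's decision feature from a per-class summary can be illustrated with a minimal sketch. This is not the authors' implementation: the toy records, the feature names `dm_dx` and `eh_dx`, and the `widest_gap` chooser (a stand-in for a clinician reading a chart) are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical toy records: binary features plus a health-status label.
PATIENTS = [
    {"dm_dx": 0, "eh_dx": 0, "label": "healthy"},
    {"dm_dx": 0, "eh_dx": 0, "label": "healthy"},
    {"dm_dx": 1, "eh_dx": 0, "label": "diabetic"},
    {"dm_dx": 1, "eh_dx": 0, "label": "diabetic"},
    {"dm_dx": 0, "eh_dx": 1, "label": "hypertensive"},
    {"dm_dx": 0, "eh_dx": 1, "label": "hypertensive"},
]

def presence_by_class(rows, features):
    """Fraction of patients in each class that have each feature.

    This is the table a bar chart would display at a given node.
    """
    totals = Counter(r["label"] for r in rows)
    table = {}
    for f in features:
        hits = Counter(r["label"] for r in rows if r[f])
        table[f] = {c: hits[c] / totals[c] for c in totals}
    return table

def build_tree(rows, features, choose):
    """Grow a tree; `choose` picks the split feature from the summary table."""
    labels = Counter(r["label"] for r in rows)
    if len(labels) == 1 or not features:
        return labels.most_common(1)[0][0]  # leaf: majority label
    f = choose(presence_by_class(rows, features))
    yes = [r for r in rows if r[f]]
    no = [r for r in rows if not r[f]]
    if not yes or not no:
        return labels.most_common(1)[0][0]  # chosen feature does not split
    rest = [g for g in features if g != f]
    return {"feature": f,
            "yes": build_tree(yes, rest, choose),
            "no": build_tree(no, rest, choose)}

def classify(tree, row):
    """Follow the decision features down to a leaf label."""
    while isinstance(tree, dict):
        tree = tree["yes"] if row[tree["feature"]] else tree["no"]
    return tree

def widest_gap(table):
    """Assumed stand-in for the clinician: pick the feature whose
    presence differs most between classes."""
    return max(table, key=lambda f: max(table[f].values()) - min(table[f].values()))
```

In an interactive setting, `choose` would be replaced by a step that plots the summary table and lets the clinician select the feature; the tree-growing logic stays the same.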