Deep learning for clustering of multivariate clinical patient trajectories with missing values.

Affiliations

UCB Biosciences GmbH, Alfred-Nobel-Strasse 10, 40789 Monheim, Germany.

Fraunhofer Institute for Algorithms and Scientific Computing, Schloss Birlinghoven, Konrad-Adenauer-Strasse, 53754 Sankt Augustin, Germany.

Publication information

Gigascience. 2019 Nov 1;8(11). doi: 10.1093/gigascience/giz134.

Abstract

BACKGROUND

Precision medicine requires a stratification of patients by disease presentation that is sufficiently informative to allow for selecting treatments on a per-patient basis. For many diseases, such as neurological disorders, this stratification problem translates into a complex problem of clustering multivariate and relatively short time series because (i) these diseases are multifactorial and not well described by single clinical outcome variables and (ii) disease progression needs to be monitored over time. Additionally, clinical data are often hindered by the presence of many missing values, further complicating any clustering attempts.

FINDINGS

The problem of clustering multivariate short time series with many missing values is generally not well addressed in the literature. In this work, we propose a deep learning-based method to address this issue, variational deep embedding with recurrence (VaDER). VaDER relies on a Gaussian mixture variational autoencoder framework, which is further extended to (i) model multivariate time series and (ii) directly deal with missing values. We validated VaDER by accurately recovering clusters from simulated and benchmark data with known ground truth clustering, while varying the degree of missingness. We then used VaDER to successfully stratify patients with Alzheimer disease and patients with Parkinson disease into subgroups characterized by clinically divergent disease progression profiles. Additional analyses demonstrated that these clinical differences reflected known underlying aspects of Alzheimer disease and Parkinson disease.
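The abstract does not spell out how missing values are handled inside the model. As a minimal sketch of one common approach, assuming a boolean mask that marks which measurements were actually observed, the reconstruction error of an autoencoder can be restricted to the observed entries so that missing visits never contribute to the loss. The function name `masked_mse` and the toy arrays are hypothetical and are not taken from the VaDER implementation.

```python
import numpy as np


def masked_mse(x, x_hat, observed):
    """Mean squared reconstruction error over observed entries only.

    x, x_hat : arrays of shape (patients, time_points, variables)
    observed : boolean array of the same shape, True where x was measured
    (Hypothetical helper for illustration; not the authors' code.)
    """
    x = np.asarray(x, dtype=float)
    x_hat = np.asarray(x_hat, dtype=float)
    observed = np.asarray(observed, dtype=bool)

    n_observed = observed.sum()
    if n_observed == 0:
        raise ValueError("no observed entries to compute the loss on")

    # np.where replaces the error at missing positions with 0,
    # so NaNs in x never reach the sum.
    squared_error = np.where(observed, (x - x_hat) ** 2, 0.0)
    return squared_error.sum() / n_observed


# Toy data: one patient, two visits, two clinical variables.
# The second variable was not measured at the first visit.
x = np.array([[[1.0, np.nan],
               [2.0, 4.0]]])
x_hat = np.array([[[1.5, 0.0],
                   [2.0, 3.0]]])
observed = ~np.isnan(x)

loss = masked_mse(x, x_hat, observed)
print(loss)
```

Here the loss averages over the three observed measurements only, so the placeholder reconstruction at the missing position has no effect on the result.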

CONCLUSIONS

We believe our results show that VaDER can be of great value for future efforts in patient stratification, and multivariate time-series clustering in general.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f460/6857688/021073e16fc2/giz134fig1.jpg
