TASTE: Temporal and Static Tensor Factorization for Phenotyping Electronic Health Records.

Authors

Afshar Ardavan, Perros Ioakeim, Park Haesun, deFilippi Christopher, Yan Xiaowei, Stewart Walter, Ho Joyce, Sun Jimeng

Affiliations

Georgia Institute of Technology.

HEALTH[at]SCALE.

Publication information

Proc ACM Conf Health Inference Learn (2020). 2020 Apr;2020:193-203. doi: 10.1145/3368555.3384464.

Abstract

Phenotyping electronic health records (EHR) focuses on defining meaningful patient groups (e.g., a heart failure group and a diabetes group) and identifying the temporal evolution of patients in those groups. Tensor factorization has been an effective tool for phenotyping. Most existing work either assumes a static patient representation built from aggregate data or models only temporal data. However, real EHR data contain both temporal information (e.g., longitudinal clinical visits) and static information (e.g., patient demographics), which are difficult to model simultaneously. In this paper, we propose Temporal And Static TEnsor factorization (TASTE), which jointly models static and temporal information to extract phenotypes. TASTE combines the PARAFAC2 model with non-negative matrix factorization to model a temporal and a static tensor. To fit the proposed model, we transform the original problem into simpler sub-problems that are solved optimally in an alternating fashion. For each sub-problem, our proposed mathematical reformulations lead to efficient solvers. Comprehensive experiments on large EHR data from a heart failure (HF) study confirmed that TASTE is up to 14× faster than several baselines, and a cardiologist confirmed that the resulting phenotypes are clinically meaningful. Using 60 phenotypes extracted by TASTE, a simple logistic regression achieves the same area under the curve (AUC) for HF prediction as a deep learning model based on recurrent neural networks (RNN) with 345 features.
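
The coupling described in the abstract, a PARAFAC2-style model for the temporal data and a non-negative matrix factorization for the static data, can be illustrated with a small objective function. The sketch below is not the authors' implementation: the symbol names (`X_list`, `A`, `U_list`, `W`, `V`, `F`), the weight `lam`, and the assumption that the two parts share the patient factor `W` are illustrative choices. It only evaluates a joint reconstruction error; it does not enforce non-negativity or the PARAFAC2 structure on `U_k`, and it does not include the alternating solver.

```python
import numpy as np

def coupled_loss(X_list, A, U_list, W, V, F, lam=1.0):
    """Joint reconstruction error for temporal slices and a static matrix.

    X_list : list of (visits_k x J) arrays, one per patient (temporal data)
    A      : (K x P) array of static features, one row per patient
    U_list : list of (visits_k x R) temporal factor matrices (assumed names)
    W      : (K x R) patient-by-phenotype weights shared by both parts
    V      : (J x R) temporal feature factor
    F      : (P x R) static feature factor
    lam    : weight of the static term (hypothetical parameter)
    """
    temporal = 0.0
    for k, (X_k, U_k) in enumerate(zip(X_list, U_list)):
        # U_k * W[k] scales column r of U_k by W[k, r], i.e. U_k @ diag(W[k]).
        X_hat = (U_k * W[k]) @ V.T
        temporal += 0.5 * np.linalg.norm(X_k - X_hat, "fro") ** 2
    # Static part: matrix factorization that reuses the same W.
    static = 0.5 * lam * np.linalg.norm(A - W @ F.T, "fro") ** 2
    return temporal + static

# Synthetic data generated exactly from the model, so the loss is zero.
rng = np.random.default_rng(0)
K, J, P, R = 4, 6, 3, 2          # patients, temporal features, static features, phenotypes
visits = [5, 3, 7, 4]            # patients may have different numbers of visits
W = rng.random((K, R))
V = rng.random((J, R))
F = rng.random((P, R))
U_list = [rng.random((n, R)) for n in visits]
X_list = [(U_k * W[k]) @ V.T for k, U_k in enumerate(U_list)]
A = W @ F.T

exact_loss = coupled_loss(X_list, A, U_list, W, V, F)
perturbed_loss = coupled_loss(X_list, A, U_list, W + 0.1, V, F)
```

Because `W` appears in both terms, changing a patient's phenotype weights affects the fit to that patient's visits and to their static features at the same time, which is the sense in which the two data sources are modeled jointly here.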

Similar articles

Communication Efficient Tensor Factorization for Decentralized Healthcare Networks.
Proc IEEE Int Conf Data Min. 2021 Dec;2021:1216-1221. doi: 10.1109/icdm51629.2021.00147. Epub 2022 Jan 24.
COPA: Constrained PARAFAC2 for Sparse & Large Datasets.
Proc ACM Int Conf Inf Knowl Manag. 2018 Oct;2018:793-802. doi: 10.1145/3269206.3271775.
