
TASTE: Temporal and Static Tensor Factorization for Phenotyping Electronic Health Records.

Authors

Afshar Ardavan, Perros Ioakeim, Park Haesun, deFilippi Christopher, Yan Xiaowei, Stewart Walter, Ho Joyce, Sun Jimeng

Affiliations

Georgia Institute of Technology.

HEALTH[at]SCALE.

Publication information

Proc ACM Conf Health Inference Learn (2020). 2020 Apr;2020:193-203. doi: 10.1145/3368555.3384464.

DOI: 10.1145/3368555.3384464
PMID: 33659966
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7924914/

Abstract

Phenotyping electronic health records (EHR) focuses on defining meaningful patient groups (e.g., heart failure group and diabetes group) and identifying the temporal evolution of patients in those groups. Tensor factorization has been an effective tool for phenotyping. Most existing works either assume a static patient representation built from aggregate data or model only temporal data. However, real EHR data contain both temporal information (e.g., longitudinal clinical visits) and static information (e.g., patient demographics), which are difficult to model simultaneously. In this paper, we propose Temporal And Static TEnsor factorization (TASTE), which jointly models both static and temporal information to extract phenotypes. TASTE combines the PARAFAC2 model with non-negative matrix factorization to model a temporal and a static tensor. To fit the proposed model, we transform the original problem into simpler ones which are optimally solved in an alternating fashion. For each of the sub-problems, our proposed mathematical re-formulations lead to efficient sub-problem solvers. Comprehensive experiments on large EHR data from a heart failure (HF) study confirmed that TASTE is up to 14× faster than several baselines, and the resulting phenotypes were confirmed to be clinically meaningful by a cardiologist. Using 60 phenotypes extracted by TASTE, a simple logistic regression can achieve the same level of area under the curve (AUC) for HF prediction as a deep learning model using recurrent neural networks (RNN) with 345 features.
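
The coupling described in the abstract can be written schematically as a single objective. What follows is a sketch under assumed notation, not the paper's exact formulation: $X_k$ (visits by temporal features for patient $k$), $A$ (patients by static features), $W$ (patient-by-phenotype memberships), $V$ and $F$ (temporal and static phenotype definitions), $Q_k$ and $H$ (PARAFAC2 factors), and the weight $\lambda$ are all names introduced here for illustration.

```latex
% Schematic coupled objective (notation assumed for illustration).
% Temporal part: PARAFAC2, X_k \approx U_k S_k V^T with U_k = Q_k H.
% Static part:   NMF,      A   \approx W F^T.
% Coupling:      S_k = diag(W_{k,:}), so row k of W serves both parts.
\begin{aligned}
\min_{\{Q_k\},\,H,\,W,\,V,\,F}\quad
  & \sum_{k=1}^{K} \tfrac{1}{2}\,
    \bigl\| X_k - Q_k H \,\operatorname{diag}(W_{k,:})\, V^{\top} \bigr\|_F^2
    \;+\; \tfrac{\lambda}{2}\,\bigl\| A - W F^{\top} \bigr\|_F^2 \\
\text{s.t.}\quad
  & Q_k^{\top} Q_k = I \quad (k = 1,\dots,K), \qquad
    W \ge 0,\quad V \ge 0,\quad F \ge 0 .
\end{aligned}
```

Under this sketch, an alternating scheme updates one block of factors at a time with the others held fixed, which matches the abstract's description of splitting the problem into simpler sub-problems. The specific sub-problem solvers are given in the paper and are not reproduced here.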



