MULTIPAR: Supervised Irregular Tensor Factorization with Multi-task Learning for Computational Phenotyping.

Author information

Ren Yifei, Lou Jian, Xiong Li, Ho Joyce C, Jiang Xiaoqian, Bhavani Sivasubramanium Venkatraman

Affiliations

Emory University, United States.

Zhejiang University, China.

Publication information

Proc Mach Learn Res. 2023 Dec;225:498-511.

Abstract

Tensor factorization has received increasing interest due to its intrinsic ability to capture latent factors in multi-dimensional data, with many applications including Electronic Health Records (EHR) mining. PARAFAC2 and its variants have been proposed to address irregular tensors, where one of the tensor modes is not aligned; for example, different patients in EHRs may have records of different lengths. PARAFAC2 has been successfully applied to EHRs to extract meaningful medical concepts (phenotypes). Despite recent advancements, the predictability and interpretability of current models are not satisfactory, which limits their utility for downstream analysis. In this paper, we propose MULTIPAR: a supervised irregular tensor factorization with multi-task learning for computational phenotyping. MULTIPAR can flexibly incorporate both static tasks (e.g., in-hospital mortality prediction) and continuous or dynamic tasks (e.g., the need for ventilation). By supervising the tensor factorization with downstream prediction tasks and leveraging information from multiple related predictive tasks, MULTIPAR can yield not only more meaningful phenotypes but also better predictive performance on those tasks. We conduct extensive experiments on two real-world temporal EHR datasets to demonstrate that MULTIPAR is scalable and achieves better tensor fit, more meaningful subgroups, and stronger predictive performance than existing state-of-the-art methods. The implementation of MULTIPAR is available.
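To make the idea of supervising an irregular factorization concrete, the following is a minimal sketch and not the authors' implementation. It assumes the standard PARAFAC2 form, in which each patient's visits-by-features matrix X_k is approximated by Q_k H diag(s_k) V^T with Q_k having orthonormal columns, and it adds a single static binary task as a logistic loss on the patient factors. The function names, the weight `alpha`, the logistic head, and the toy data are all illustrative assumptions; the abstract does not specify the actual objective, and dynamic tasks are not covered here.

```python
import numpy as np

def reconstruct_slice(Q_k, H, s_k, V):
    """Standard PARAFAC2 model for one patient: X_k ~ Q_k H diag(s_k) V^T."""
    return Q_k @ H @ np.diag(s_k) @ V.T

def supervised_loss(X, Q, H, S, V, w, b, y, alpha=1.0):
    """Reconstruction error plus a weighted logistic loss on patient factors S.

    The form of this joint objective is an assumption made for illustration.
    """
    # Squared Frobenius error summed over patients; each slice may have a
    # different number of rows (visits), which is what makes the tensor irregular.
    recon = sum(
        np.linalg.norm(X_k - reconstruct_slice(Q_k, H, s_k, V)) ** 2
        for X_k, Q_k, s_k in zip(X, Q, S)
    )
    # Static task (e.g. a binary outcome per patient) predicted from row k of S.
    logits = S @ w + b
    # Binary cross-entropy in a numerically stable form: log(1 + e^z) - y * z
    bce = np.mean(np.logaddexp(0.0, logits) - y * logits)
    return recon + alpha * bce

# Toy data: 3 hypothetical patients with 4, 6, and 7 visits, 5 features, rank 3.
rng = np.random.default_rng(0)
R, J = 3, 5
visits = [4, 6, 7]
H = rng.normal(size=(R, R))
V = rng.normal(size=(J, R))
S = rng.uniform(0.1, 1.0, size=(len(visits), R))
# Reduced QR yields a matrix with orthonormal columns for each patient.
Q = [np.linalg.qr(rng.normal(size=(n, R)))[0] for n in visits]
X = [reconstruct_slice(Q_k, H, s_k, V) for Q_k, s_k in zip(Q, S)]
y = np.array([0.0, 1.0, 1.0])

loss = supervised_loss(X, Q, H, S, V, np.zeros(R), 0.0, y)
print(loss)
```

In this toy setup the slices are generated from the factors themselves, so the reconstruction term vanishes and only the classification term contributes. With real data, both terms would be minimized jointly over the factors and the classifier weights, so that the labels influence the learned phenotypes.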
