Department of Operations and Information Systems, David Eccles School of Business, University of Utah, Salt Lake City, UT, United States.
Department of Information Systems, WP Carey School of Business, Arizona State University, Phoenix, AZ, United States.
J Med Internet Res. 2021 Feb 12;23(2):e18372. doi: 10.2196/18372.
BACKGROUND: Acute diseases present severe complications that develop rapidly, exhibit distinct phenotypes, and have profound effects on patient outcomes. Predictive analytics can enhance physicians' care and management of patients with acute diseases by predicting crucial complication phenotypes in time for diagnosis and treatment. However, effective phenotype prediction must overcome several challenges. First, patient data collected in the early stages of an acute disease (eg, clinical data and laboratory results) are less informative for predicting phenotypic outcomes. Second, patient data are temporal and heterogeneous; for example, patients receive laboratory tests at different time intervals and frequencies. Third, imbalanced distributions of patient outcomes create additional complexity for predicting complication phenotypes.
OBJECTIVE: To predict crucial complication phenotypes among patients with acute diseases, we propose a novel, deep learning-based method that uses recurrent neural network-based sequence embedding to represent disease progression while considering temporal heterogeneities in patient data. Our method incorporates a latent regulator to alleviate data insufficiency constraints by accounting for the underlying mechanisms that are not observed in patient data. The proposed method also includes cost-sensitive learning to address imbalanced outcome distributions in patient data for improved predictions.
METHODS: From a major health care organization in Taiwan, we obtained a sample of 10,354 electronic health records that pertained to 6545 patients with peritonitis. The proposed method projects these temporal, heterogeneous clinical data into a substantially reduced feature space and then incorporates a latent regulator (latent parameter matrix) to compensate for data insufficiencies and account for variations in phenotypic expressions. Moreover, our method employs cost-sensitive learning to further increase the predictive performance.
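As an illustration of the cost-sensitive learning idea mentioned above, the following is a minimal sketch, not the authors' implementation. The abstract does not specify the cost scheme, so inverse-frequency class weights are assumed here, and the function names `class_weights` and `weighted_log_loss` are hypothetical.

```python
import math


def class_weights(labels):
    """Assumed scheme: inverse-frequency weights, n / (2 * class count)."""
    n = len(labels)
    n_pos = sum(labels)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")
    # The rarer class receives the larger weight.
    return {0: n / (2 * n_neg), 1: n / (2 * n_pos)}


def weighted_log_loss(labels, probs, weights, eps=1e-12):
    """Binary cross-entropy with each example scaled by its class weight."""
    total = 0.0
    weight_sum = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        loss = -(y * math.log(p) + (1 - y) * math.log(1 - p))
        total += weights[y] * loss
        weight_sum += weights[y]
    return total / weight_sum


# Toy data: nine negatives and one positive (the rare complication).
labels = [0] * 9 + [1]
weights = class_weights(labels)

# Two predictions that each get exactly one patient badly wrong.
miss_positive = [0.1] * 10                   # the one positive is missed
miss_negative = [0.9] + [0.1] * 8 + [0.9]    # one negative is a false alarm

loss_miss_positive = weighted_log_loss(labels, miss_positive, weights)
loss_miss_negative = weighted_log_loss(labels, miss_negative, weights)
```

With unit weights the two predictions incur the same loss; under the assumed weighting, missing the rare positive case costs more, which pushes a model trained on this loss toward higher recall on the minority class.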
RESULTS: We evaluated the efficacy of the proposed method for predicting two hepatic complication phenotypes in patients with peritonitis: acute hepatic encephalopathy and hepatorenal syndrome. The following three benchmark techniques were evaluated: temporal multiple measurement case-based reasoning (MMCBR), temporal long short-term memory (T-LSTM) networks, and time fusion convolutional neural network (CNN). For acute hepatic encephalopathy predictions, our method attained an area under the curve (AUC) value of 0.82, which outperforms temporal MMCBR by 64%, T-LSTM by 26%, and time fusion CNN by 26%. For hepatorenal syndrome predictions, our method achieved an AUC value of 0.64, which is 29% better than that of temporal MMCBR (0.54). Overall, the evaluation results show that the proposed method significantly outperforms all the benchmarks, as measured by recall, F-measure, and AUC, while maintaining comparable precision values.
CONCLUSIONS: The proposed method learns a short-term temporal representation from patient data to predict complication phenotypes and offers greater predictive utility than prevalent data-driven techniques. This method is generalizable and can be applied to different acute disease (illness) scenarios that are characterized by insufficient patient clinical data availability, temporal heterogeneities, and imbalanced distributions of important patient outcomes.
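For readers less familiar with the evaluation measures named in the results (precision, recall, F-measure, and AUC), the sketch below shows one standard way to compute them for a binary outcome. It is illustrative only: the toy labels and scores are made up, the 0.5 decision threshold is an assumption, and the function names are hypothetical rather than taken from the study.

```python
def precision_recall_f1(labels, preds):
    """Precision, recall, and F-measure for binary labels and predictions."""
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair (the Mann-Whitney formulation).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Made-up toy example: two negatives, two positives.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
preds = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

precision, recall, f1 = precision_recall_f1(labels, preds)
area = auc(labels, scores)
```

Because AUC depends only on how cases are ranked, it is unaffected by the choice of threshold, whereas precision, recall, and F-measure all change with it. This is one reason imbalanced-outcome studies tend to report both kinds of measure.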