Alramadeen Wesam, Ding Yu, Costa Carlos, Si Bing
Department of Systems Science and Industrial Engineering, State University of New York at Binghamton, Binghamton, NY 13902, USA.
IBM T. J. Watson Research Center, Yorktown Heights, NY 10510, USA.
IISE Trans Healthc Syst Eng. 2023;13(3):215-225. doi: 10.1080/24725579.2023.2202877. Epub 2023 Apr 26.
Digital health and telemonitoring have generated a wealth of information that can be used to monitor, manage, and improve human health. The resulting multi-source mixed-frequency health data have many challenging properties that overwhelm the modeling capacity of existing statistical and machine learning models. Although predictive analytics for big health data plays an important role in telemonitoring, there is a lack of rigorous prediction models that can automatically predict patients' health conditions, e.g., Disease Severity Indicators (DSIs), from multi-source mixed-frequency data. Sleep disorder is a prevalent cardiac syndrome that is characterized by abnormal respiratory patterns during sleep. Although wearable devices are available to administer sleep studies at home, the manual scoring process used to generate the DSI remains a bottleneck in the automated monitoring and diagnosis of sleep disorder. To address the multi-fold challenges of precisely predicting the DSI from high-dimensional multi-source mixed-frequency data in sleep disorder, we propose a sparse linear mixed model that combines the modified Cholesky decomposition with group lasso penalties to enable joint group selection of fixed effects and random effects. A novel Expectation Maximization (EM) algorithm integrated with an efficient Majorization Maximization (MM) algorithm is developed to estimate the proposed sparse linear mixed model with group variable selection. The proposed method was applied to the Sleep Heart Health Study (SHHS) data for telemonitoring and diagnosis of sleep disorder, where it identified a few significant feature groups that are consistent with prior medical studies on sleep disorder. The proposed method also achieved the highest prediction accuracy among the benchmark methods it was compared against.
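As an illustration of how a group lasso penalty selects whole feature groups, the following minimal sketch applies group soft-thresholding (the proximal operator of the penalty) to a coefficient vector. It is not the EM-MM algorithm described above and covers neither the random effects nor the modified Cholesky decomposition; the function name `group_soft_threshold`, the toy coefficients, the grouping, and the penalty weight `lam` are all assumptions made for this example.

```python
import numpy as np


def group_soft_threshold(beta, groups, lam):
    """Proximal operator of lam * sum_g ||beta_g||_2 (group lasso penalty).

    Hypothetical helper for illustration only; not taken from the paper.
    """
    out = np.zeros_like(beta, dtype=float)
    for idx in groups:
        block = beta[idx]
        norm = np.linalg.norm(block)
        # A group survives only if its Euclidean norm exceeds lam;
        # otherwise every coefficient in the group is set to zero together.
        if norm > lam:
            out[idx] = (1.0 - lam / norm) * block
    return out


# Toy coefficients: one strong feature group and one weak feature group.
beta = np.array([3.0, 4.0, 0.1, -0.2])
groups = [[0, 1], [2, 3]]
shrunk = group_soft_threshold(beta, groups, lam=1.0)
print(shrunk)
```

In this toy case the first group is shrunk toward zero but retained, while the second group is removed as a whole. That all-or-nothing behavior at the group level is what makes a penalty of this kind suitable for selecting feature groups rather than individual features.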