Philips Research North America, Cambridge, MA, 02141, USA.
Department of Biostatistics and Data Science, The University of Texas Health Science Center at Houston, Houston, 77030, TX, USA.
BMC Med Inform Decis Mak. 2019 Jan 31;19(Suppl 1):18. doi: 10.1186/s12911-019-0734-y.
Congestive heart failure is one of the most common reasons for hospitalization among people aged 65 and over in the United States, and it imposes a considerable economic burden. Accurately predicting near-term hospitalization due to congestive heart failure could help prevent avoidable admissions, optimize the use of medical resources, and better meet patients' healthcare needs.
To make full use of the monthly updated claim feed data released by the Centers for Medicare and Medicaid Services (CMS), we present a dynamic random survival forest model adapted to periodically updated data for predicting the risk of adverse events. We apply the model to dynamically predict the risk of hospital admission among patients with congestive heart failure, identified using the Accountable Care Organization Operational System Claim and Claim Line Feed data from February 2014 to September 2015. We benchmark the proposed model against two models commonly used in the medical application literature: the Cox proportional hazards model and the logistic regression model with an L1-norm penalty.
Results show that our model achieves a high area under the ROC curve (AUC) across time points and a high C-statistic. Beyond its predictive performance, it provides measures of variable importance and individual-level instant risk.
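For readers unfamiliar with the C-statistic, the following is a minimal sketch of Harrell's concordance index, a common form of the C-statistic for survival models. The abstract does not state which variant the authors used, so this is an illustrative assumption; the function name `concordance_index` and its arguments are hypothetical, and pairs with tied event times are simply skipped.

```python
from typing import Sequence


def concordance_index(
    times: Sequence[float],
    events: Sequence[bool],
    risks: Sequence[float],
) -> float:
    """Fraction of comparable pairs in which the higher predicted risk
    belongs to the subject whose event occurred earlier.

    Illustrative sketch only: tied event times are ignored, and tied
    risk scores count as half-concordant.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a pair is anchored on a subject with an observed event
        for j in range(n):
            # Subject j is comparable if still event-free after i's event time.
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 1.0 means the predicted risks order every comparable pair correctly, while 0.5 corresponds to random ordering.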
We present an efficient model for predicting the risk of hospitalization that is adapted to periodically updated data, such as the monthly updated claim feed data released by CMS. In addition to processing high-volume, periodically updated, stream-like data, our model captures event onset and time-to-event information, incorporates time-varying features, offers insight into variable importance, and has good predictive power. To the best of our knowledge, this is the first work to combine the sliding window technique with the random survival forest model. The model achieves strong performance and could be easily deployed to monitor patients in real time.
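The sliding window idea can be illustrated with a minimal sketch that turns one patient's monthly claim counts into survival-style training samples, one per window position. This is not the authors' implementation: the window length, prediction horizon, the single `n_claims` feature, and all names (`WindowSample`, `sliding_window_samples`) are assumptions made for illustration. Samples of this shape could then be passed to any survival learner, such as a random survival forest.

```python
from typing import List, NamedTuple, Optional


class WindowSample(NamedTuple):
    patient_id: str
    window_end: int      # index of the last month included in the features
    n_claims: int        # total claims observed inside the window
    time_to_event: int   # months from window_end to admission or censoring
    event: bool          # True if an admission occurred within the horizon


def sliding_window_samples(
    patient_id: str,
    monthly_claims: List[int],
    admission_month: Optional[int],
    window: int = 3,
    horizon: int = 6,
) -> List[WindowSample]:
    """Slide a fixed-length window over monthly data (hypothetical layout).

    Each window position yields features from the last `window` months and
    a (time, event) outcome observed over at most `horizon` later months.
    """
    samples: List[WindowSample] = []
    last_month = len(monthly_claims) - 1
    for end in range(window - 1, last_month):
        if admission_month is not None and admission_month <= end:
            break  # already admitted, so no longer at risk
        n_claims = sum(monthly_claims[end - window + 1 : end + 1])
        # Follow-up is limited by the horizon and by the available data.
        follow_up = min(horizon, last_month - end)
        if admission_month is not None and admission_month - end <= follow_up:
            samples.append(
                WindowSample(patient_id, end, n_claims, admission_month - end, True)
            )
        else:
            # No admission seen during follow-up: right-censored sample.
            samples.append(WindowSample(patient_id, end, n_claims, follow_up, False))
    return samples
```

When a new month of claims arrives, the function can be rerun on the extended series, so each update adds window positions without changing how earlier windows' features are computed.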