Ramljak Dusan, Davey Adam, Uversky Alexey, Roychoudhury Shoumik, Obradovic Zoran
Center for Data Analytics and Biomedical Informatics, Temple University, Philadelphia, PA, USA.
AMIA Annu Symp Proc. 2015 Nov 5;2015:1047-56. eCollection 2015.
The Hospital Readmissions Reduction Program (HRRP), introduced in October 2012 as part of the Affordable Care Act (ACA), ties hospital reimbursement rates to adjusted 30-day readmission and mortality performance for a small set of target diagnoses. There is growing concern, and emerging evidence, that using such a small set of target diagnoses to establish reimbursement rates can lead to unstable results that are susceptible to manipulation (gaming) by hospitals.
We propose a novel approach to identifying co-occurring diagnoses and procedures that can themselves serve as a proxy indicator of the target diagnosis. The proposed approach constructs a Markov Blanket that delivers a high level of performance, in terms of predictive accuracy and scalability, while keeping the obtained results interpretable. To scale to a large number of co-occurring diagnoses (features) and hospital discharge records (samples), our approach begins with Google's PageRank algorithm and exploits the stability of the obtained results to rank each diagnosis/procedure by its contribution, in terms of presence in a Markov Blanket, to outcome prediction.
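As a rough illustration of the ranking step, the sketch below runs PageRank by power iteration over a co-occurrence graph of diagnosis codes. The abstract does not specify how the graph is built, so the construction here (undirected edges weighted by the number of records in which two codes appear together), the function names `cooccurrence_graph` and `pagerank`, the damping factor, and the toy records are all assumptions made for illustration, not the authors' implementation.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence_graph(records):
    """Assumed construction: undirected graph whose edge weight is the
    number of discharge records in which two codes appear together."""
    weights = defaultdict(lambda: defaultdict(float))
    for codes in records:
        for a, b in combinations(sorted(set(codes)), 2):
            weights[a][b] += 1.0
            weights[b][a] += 1.0
    return weights


def pagerank(weights, damping=0.85, iters=100, tol=1e-10):
    """Weighted PageRank by power iteration; returns {code: score}."""
    nodes = sorted(weights)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    # Every node in the graph has at least one edge, so these sums are > 0.
    out_sum = {v: sum(weights[v].values()) for v in nodes}
    for _ in range(iters):
        new = {v: (1.0 - damping) / n for v in nodes}
        for u in nodes:
            for v, w in weights[u].items():
                new[v] += damping * rank[u] * w / out_sum[u]
        delta = sum(abs(new[v] - rank[v]) for v in nodes)
        rank = new
        if delta < tol:
            break
    return rank


# Hypothetical toy discharge records (lists of co-occurring codes).
records = [
    ["AMI", "HTN", "DM"],
    ["AMI", "HTN"],
    ["AMI", "CKD"],
    ["PN", "COPD"],
    ["AMI", "DM"],
]
scores = pagerank(cooccurrence_graph(records))
ranking = sorted(scores, key=scores.get, reverse=True)
```

In this toy data `ranking` places the most widely co-occurring code first. Repeating the ranking on different subsets of records and comparing the resulting orderings is one plausible way to probe the stability the abstract refers to.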
Presence of the target diagnoses acute myocardial infarction (AMI), congestive heart failure (CHF), pneumonia (PN), and sepsis in hospital discharge records for Medicare and Medicaid patients in California and New York state hospitals (2009-2011) was predicted using models trained on a subset of California state hospitals (2003-2008). Using repeated holdout evaluation on ~30,000,000 hospital discharge records, we analyzed the stability of the proposed approach. Model performance was measured using the area under the ROC curve (AUC), together with the importance and contribution of individual features to the final result. The results ranged from AUC=0.68 (SE<1e-4) for PN on cross-validation datasets to AUC=0.94 (SE<1e-7) for sepsis on California hospitals (2009-2011), while feature stability consistently improved with more training data for each target diagnosis. Prediction accuracy for the considered target diagnoses approaches or exceeds accuracy estimates for discharge record data.
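For readers unfamiliar with the reported metrics, the following minimal sketch shows one way to compute an AUC and to summarize it across repeated holdout runs. The abstract does not state how the standard error was obtained, so the standard error of the mean used here is an assumption, as are the helper names `auc` and `mean_and_se`. The pairwise AUC below is quadratic in the number of records and is meant only to show the definition; it would not scale to the dataset sizes described.

```python
import math


def auc(labels, scores):
    """AUC as the probability that a random positive record is scored
    above a random negative one (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def mean_and_se(values):
    """Mean and standard error of the mean over repeated holdout runs
    (assumed summary; requires at least two runs)."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, math.sqrt(var / n)


# Hypothetical labels (1 = target diagnosis present) and model scores.
example_auc = auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])

# Hypothetical AUC values from two holdout repetitions.
avg, se = mean_and_se([0.5, 0.7])
```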
This paper presents a novel approach to identifying a small subset of relevant diagnoses and procedures that approximate the Markov Blanket for target diagnoses. Accuracy and interpretability of results demonstrate the potential of our approach.