Rubinfeld Ilan, Farooq Maria, Velanovich Vic, Syed Zeeshan
Henry Ford Health System, Detroit, MI
AMIA Annu Symp Proc. 2010 Nov 13;2010:777-81.
As medicine becomes increasingly data-driven, caregivers must collect and analyze an ever-larger volume of patient data. Although methods for studying these data have evolved recently, collecting clinically validated data remains cumbersome. We explored how to reduce the amount of data needed to risk-stratify patients. Using patient data from the National Surgical Quality Improvement Program (NSQIP), we studied how the accuracy of predictive models is affected by changing the number of variables, the categories of variables, and the times at which the variables are collected. Comparing predictive models built on the entire NSQIP variable set with models built on smaller, selected groups of variables, we found that using far fewer variables than is traditional can yield similar predictive accuracy.