Hu Zhen, Melton Genevieve B, Arsoniadis Elliot G, Wang Yan, Kwaan Mary R, Simon Gyorgy J
Institute for Health Informatics, University of Minnesota, Minneapolis, MN, USA.
Institute for Health Informatics, University of Minnesota, Minneapolis, MN, USA; Department of Surgery, University of Minnesota, Minneapolis, MN, USA.
J Biomed Inform. 2017 Apr;68:112-120. doi: 10.1016/j.jbi.2017.03.009. Epub 2017 Mar 16.
Proper handling of missing data is important for many secondary uses of electronic health record (EHR) data. Data imputation methods can be used to handle missing data, but their use in analyzing EHR data has been limited, and their specific efficacy for postoperative complication detection is unclear. Several data imputation methods were used to develop models for automated detection of three types of surgical site infection (SSI) (superficial, deep, and organ space) and of overall SSI, using American College of Surgeons National Surgical Quality Improvement Program (NSQIP) Registry 30-day SSI occurrence data as the reference standard. Overall, models with missing data imputation almost always outperformed reference models without imputation, which included only cases with complete data, and achieved very good average area under the curve values for detection of overall SSI. Missing data imputation appears to be an effective means of improving postoperative SSI detection using EHR clinical data.
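The contrast the abstract draws, between a complete-case reference model and models built on imputed data, can be illustrated with a minimal sketch. The abstract does not name the imputation methods that were evaluated, so column-mean imputation is used here only as the simplest stand-in. The feature values, labels, and function names are hypothetical and are not taken from the study.

```python
from statistics import mean


def complete_cases(rows):
    """Keep only rows with no missing (None) feature values."""
    return [r for r in rows if all(v is not None for v in r)]


def mean_impute(rows):
    """Replace each missing value with the mean of the observed values in its column."""
    n_cols = len(rows[0])
    col_means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        col_means.append(mean(observed))
    return [[col_means[j] if v is None else v for j, v in enumerate(r)]
            for r in rows]


def auc(scores, labels):
    """Area under the ROC curve by pairwise comparison; ties count as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: two made-up numeric features per case, None = missing.
rows = [
    [12.0, 38.5],
    [None, 37.0],
    [7.0, None],
    [15.0, 39.0],
    [6.5, 36.8],
]
labels = [1, 0, 0, 1, 0]  # 1 = SSI recorded in the reference standard (made up)

reference = complete_cases(rows)  # drops every case with a missing value
imputed = mean_impute(rows)       # keeps all cases, fills the gaps

print(len(reference), "of", len(rows), "cases survive complete-case filtering")
print(len(imputed), "cases are available after mean imputation")

# Using the first feature alone as a risk score, purely to exercise auc().
print("toy AUC:", auc([r[0] for r in imputed], labels))
```

The sketch only shows the mechanics: complete-case filtering shrinks the sample, while imputation retains every case. It makes no claim about which approach performs better on real data.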