Hravnak Marilyn, Chen Lujie, Dubrawski Artur, Bose Eliezer, Clermont Gilles, Pinsky Michael R
Department of Acute and Tertiary Care, University of Pittsburgh School of Nursing, 336 Victoria Hall, 3500 Victoria St., Pittsburgh, PA, 15261, USA.
Carnegie Mellon University Robotics Institute (Auton Lab), Pittsburgh, PA, USA.
J Clin Monit Comput. 2016 Dec;30(6):875-888. doi: 10.1007/s10877-015-9788-2. Epub 2015 Oct 5.
Huge hospital information system databases can be mined for knowledge discovery and decision support, but artifact in stored non-invasive vital sign (VS) high-frequency data streams limits their use. We used machine-learning (ML) algorithms trained on expert-labeled VS data streams to automatically classify VS alerts as real or artifact, thereby "cleaning" such data for future modeling. A total of 634 admissions to a step-down unit had recorded continuous noninvasive VS monitoring data [heart rate (HR), respiratory rate (RR), and peripheral arterial oxygen saturation (SpO2) at 1/20 Hz, plus noninvasive oscillometric blood pressure (BP)]. Periods when data crossed stability thresholds defined VS event epochs. Data were divided into Block 1 as the ML training/cross-validation set and Block 2 as the test set. Expert clinicians annotated Block 1 events as perceived real or artifact. After feature extraction, ML algorithms were trained to create and validate models that automatically classify events as real or artifact. The models were then tested on Block 2. Block 1 yielded 812 VS events, of which experts judged 214 (26 %) to be artifact (RR 43 %, SpO2 40 %, BP 15 %, HR 2 %). ML algorithms applied to the Block 1 training/cross-validation set (tenfold cross-validation) gave area under the curve (AUC) scores of 0.97 for RR, 0.91 for BP, and 0.76 for SpO2. Performance on the Block 2 test data was AUC 0.94 for RR, 0.84 for BP, and 0.72 for SpO2. ML-defined algorithms applied to archived multi-signal continuous VS monitoring data allowed accurate automated classification of VS alerts as real or artifact, and could support data mining for future model building.
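As an illustration of how VS event epochs might be extracted from a stored stream, the following is a minimal sketch, not the authors' implementation. It assumes an event epoch is a contiguous run of samples outside a stability range; the function names, the `min_samples` parameter, and the HR thresholds in the example are hypothetical, since the abstract does not state the thresholds used. Only the 1/20 Hz sampling rate (one sample every 20 s) comes from the abstract.

```python
from typing import List, Sequence, Tuple

# 1/20 Hz sampling means one sample every 20 seconds (rate taken from the abstract).
SAMPLE_PERIOD_S = 20


def find_event_epochs(
    values: Sequence[float],
    low: float,
    high: float,
    min_samples: int = 1,
) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs, end exclusive, for each contiguous
    run of samples lying outside the stability range [low, high].

    Runs shorter than min_samples are discarded. The thresholds and
    min_samples are illustrative assumptions, not values from the study.
    """
    epochs: List[Tuple[int, int]] = []
    start = None
    for i, v in enumerate(values):
        outside = v < low or v > high
        if outside and start is None:
            start = i  # a new excursion begins
        elif not outside and start is not None:
            if i - start >= min_samples:
                epochs.append((start, i))
            start = None  # excursion ended
    # Close an excursion that runs to the end of the stream.
    if start is not None and len(values) - start >= min_samples:
        epochs.append((start, len(values)))
    return epochs


def epoch_duration_s(epoch: Tuple[int, int]) -> int:
    """Duration of an epoch in seconds, given the 20 s sample period."""
    start, end = epoch
    return (end - start) * SAMPLE_PERIOD_S


# Hypothetical heart-rate stream and hypothetical thresholds (40-120 bpm).
hr = [80, 82, 130, 135, 128, 85, 84, 39, 81]
hr_epochs = find_event_epochs(hr, low=40, high=120)
hr_durations = [epoch_duration_s(e) for e in hr_epochs]
```

In this sketch each detected epoch would then be passed to feature extraction and to a classifier that labels it real or artifact; those later steps are not shown because the abstract does not specify the features or models.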