Gu Yue, Li Xinyu, Chen Shuhong, Li Huangcan, Farneth Richard A, Marsic Ivan, Burd Randall S
Department of Electrical and Computer Engineering, Rutgers University, Piscataway, NJ, USA.
Trauma and Burn Surgery, Children's National Medical Center, Washington, DC, USA.
Proc (IEEE Int Conf Healthc Inform). 2017 Aug;2017:239-247. doi: 10.1109/ICHI.2017.50. Epub 2017 Sep 14.
Process phase detection has been widely used in surgical process modeling (SPM) to track process progression. Most of these studies used video and embedded sensor data, but spoken language also provides rich semantic information directly related to process progression. We present a long short-term memory (LSTM) deep learning model that predicts trauma resuscitation phases from verbal communication logs. We first use an LSTM to extract a meaning representation for each sentence, and then feed these representations sequentially into a second LSTM to extract the meaning of the group of sentences within a time window. This window-level representation is then used for phase prediction. Of 30 manually transcribed trauma resuscitation cases, we used 24 to train our model and the remaining 6 to test it. The model achieved 79.12% accuracy and showed performance advantages over existing visual-audio systems for critical phases of the process. In addition to language information, we evaluated a multimodal phase prediction structure that also uses audio input. Finally, we identified the challenges of substituting automatic speech recognition for manual transcription in trauma resuscitation.
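The two-level structure described in the abstract (one LSTM over the words of each sentence, a second LSTM over the resulting sentence vectors in a time window, then a phase classifier) can be illustrated with a minimal sketch. This is not the authors' implementation: the weights are random and untrained, and the class names, layer sizes, phase labels, vocabulary, and example utterances are all hypothetical choices made only to show the data flow.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class LSTMCell:
    """Minimal LSTM with randomly initialised (untrained) weights."""

    def __init__(self, input_size, hidden_size, rng):
        self.hidden_size = hidden_size
        n = input_size + hidden_size
        # One weight matrix and bias per gate: input, forget, output, candidate.
        self.W = {g: [[rng.uniform(-0.1, 0.1) for _ in range(n)]
                      for _ in range(hidden_size)] for g in "ifoc"}
        self.b = {g: [0.0] * hidden_size for g in "ifoc"}

    def _gate(self, name, z, act):
        return [act(sum(w * v for w, v in zip(row, z)) + b)
                for row, b in zip(self.W[name], self.b[name])]

    def run(self, sequence):
        """Consume a sequence of vectors and return the final hidden state."""
        h = [0.0] * self.hidden_size
        c = [0.0] * self.hidden_size
        for x in sequence:
            z = list(x) + h
            i = self._gate("i", z, sigmoid)
            f = self._gate("f", z, sigmoid)
            o = self._gate("o", z, sigmoid)
            g = self._gate("c", z, math.tanh)
            c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
            h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
        return h


class HierarchicalPhaseModel:
    """Sentence-level LSTM feeding a window-level LSTM (hypothetical sizes)."""

    def __init__(self, vocab, num_phases, embed_size=8, hidden_size=6, seed=0):
        rng = random.Random(seed)
        self.embed = {w: [rng.uniform(-0.1, 0.1) for _ in range(embed_size)]
                      for w in vocab}
        self.unknown = [0.0] * embed_size  # vector for out-of-vocabulary words
        self.sentence_lstm = LSTMCell(embed_size, hidden_size, rng)
        self.window_lstm = LSTMCell(hidden_size, hidden_size, rng)
        self.out = [[rng.uniform(-0.1, 0.1) for _ in range(hidden_size)]
                    for _ in range(num_phases)]

    def predict(self, window):
        # Level 1: encode each sentence from its word vectors.
        sentence_vecs = [
            self.sentence_lstm.run(
                [self.embed.get(w, self.unknown) for w in s.lower().split()])
            for s in window
        ]
        # Level 2: encode the ordered sentence vectors of the time window.
        window_vec = self.window_lstm.run(sentence_vecs)
        # Classifier: linear layer plus softmax over the phases.
        scores = [sum(w * v for w, v in zip(row, window_vec))
                  for row in self.out]
        return softmax(scores)


# Hypothetical phase labels and utterances, for illustration only.
PHASES = ["pre-arrival", "primary survey", "secondary survey", "departure"]
window = ["airway is clear", "breath sounds equal", "check the pulse"]
vocab = sorted({w for s in window for w in s.split()})

model = HierarchicalPhaseModel(vocab, num_phases=len(PHASES))
probs = model.predict(window)
predicted_phase = PHASES[probs.index(max(probs))]
```

Because the weights are untrained, the predicted phase carries no meaning here. In a real system both LSTMs and the classifier would be trained jointly on transcribed cases, typically with a deep learning framework rather than hand-written cells.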