Modality.AI, Inc., San Francisco, CA.
Department of Otolaryngology-Head and Neck Surgery, University of California, San Francisco.
J Speech Lang Hear Res. 2024 Nov 7;67(11):4233-4245. doi: 10.1044/2024_JSLHR-24-00142. Epub 2024 Jul 10.
Automated remote assessment and monitoring of patients' neurological and mental health is becoming an essential component of the digital clinic and telehealth ecosystem, especially since the COVID-19 pandemic. This article reviews the modalities of health information that are useful for developing such remote clinical assessments in the real world at scale.
We first present an overview of the various modalities of health information (speech acoustics, natural language, conversational dynamics, orofacial or full-body movement, eye gaze, respiration, cardiopulmonary, and neural), each of which can be extracted from various signal sources (audio, video, text, or sensors). We further motivate their clinical utility with examples of how information from each modality can help characterize how different disorders affect different aspects of patients' spoken communication. We then elucidate the advantages of combining two or more of these modalities toward a more holistic, informative, and robust assessment.
We find that combining multiple modalities of health information allows for improved scientific interpretability, improved performance on downstream health applications such as early detection and progress monitoring, improved technological robustness, and improved user experience. We illustrate how these principles can be leveraged for remote clinical assessment at scale using a real-world case study of the Modality assessment platform.
This review article motivates the combination of human-centric information from multiple modalities to measure various aspects of patients' health, arguing that remote clinical assessment that integrates this complementary information can be more effective and lead to better clinical outcomes than using any one data stream in isolation.