Banerji Christopher R S, Bhardwaj Shah Aroon, Dabson Ben, Chakraborti Tapabrata, Hellon Vicky, Harbron Chris, MacArthur Ben D
The Alan Turing Institute, London, UK.
University College London Hospital, University College London Hospitals NHS Trust, London, UK.
EClinicalMedicine. 2025 May 23;84:103252. doi: 10.1016/j.eclinm.2025.103252. eCollection 2025 Jun.
Multimodal artificial intelligence (AI) is a powerful new technological advance, capable of learning simultaneously from diverse data types, such as text, images, video, and audio. Because clinical decisions are usually based on information from multiple sources, multimodal AI has the potential to significantly improve clinical practice. However, unlike most multimodal AI workflows developed to date, clinical medicine is a process that is both dynamic and interventional: the clinician continually learns about the patient's health and acts accordingly as data are collected. In this article we argue that multimodal clinical AI must be fully attuned to the particular challenges and constraints of the clinic, and that clinician involvement is needed throughout development, not just at clinical deployment. We propose ways in which clinician involvement can add value at each stage of the multimodal AI development pipeline, and argue for the establishment of actively managed multidisciplinary communities that work collaboratively towards the shared goal of improving the health of all.