Cui Can, Yang Haichun, Wang Yaohong, Zhao Shilin, Asad Zuhayr, Coburn Lori A, Wilson Keith T, Landman Bennett A, Huo Yuankai
Department of Computer Science, Vanderbilt University, Nashville, TN 37235, United States of America.
Department of Pathology, Microbiology and Immunology, Vanderbilt University Medical Center, Nashville, TN 37215, United States of America.
Prog Biomed Eng (Bristol). 2023 Apr 11;5(2). doi: 10.1088/2516-1091/acc2fe.
The rapid development of diagnostic technologies in healthcare places ever higher demands on physicians to handle and integrate the heterogeneous yet complementary data produced during routine practice. For instance, personalized diagnosis and treatment planning for a single cancer patient rely on various images (e.g. radiology, pathology, and camera images) and non-image data (e.g. clinical and genomic data). However, such decision-making procedures can be subjective and qualitative, and exhibit large inter-subject variability. With recent advances in multimodal deep learning, an increasing number of efforts have been devoted to a key question: how do we extract and aggregate multimodal information to ultimately provide more objective, quantitative computer-aided clinical decision making? This paper reviews recent studies addressing this question. Briefly, the review includes (a) an overview of current multimodal learning workflows, (b) a summary of multimodal fusion methods, (c) a discussion of performance, (d) applications in disease diagnosis and prognosis, and (e) challenges and future directions.