CERVO Brain Research Center, Québec, Québec, Canada.
Physics Department, Université Laval, Québec, Québec, Canada.
Biomed Phys Eng Express. 2024 Sep 13;10(6). doi: 10.1088/2057-1976/ad72f9.
Some pathologies, such as cancer and dementia, require multiple imaging modalities to fully diagnose and assess the extent of the disease. Magnetic resonance imaging offers this kind of versatility, but examinations take time and can require contrast agent injection. Flexibly synthesizing missing imaging sequences from those available for a given patient could help reduce scan times or circumvent the need for contrast agent injection. In this work, we propose a deep learning architecture that can synthesize all missing imaging sequences from any subset of available images. The network is trained adversarially, with a generator consisting of parallel 3D U-Net encoders and decoders that optimally combine their multi-resolution representations through a fusion operation learned by an attention network trained jointly with the generator. We compare our synthesis performance with that of 3D networks using other fusion schemes and a comparable number of trainable parameters, such as mean/variance fusion. In all synthesis scenarios except one, the network using attention-guided fusion outperformed the other fusion schemes. We also inspect the encoded representations and the attention network outputs to gain insight into the synthesis process, and uncover desirable behaviors such as prioritization of specific modalities, flexible construction of the representation when important modalities are missing, and modalities being selected in regions where they carry sequence-specific information. This work suggests that a better construction of the latent representation space in hetero-modal networks can be achieved by using an attention network.
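The contrast between attention-guided fusion and the mean/variance baseline can be illustrated with a minimal sketch. The snippet below is not the authors' implementation; it assumes per-modality encoder feature maps of identical shape and an attention network that emits one logit map per available modality, with the fusion applied voxel-wise via a softmax over the modality axis (function names and shapes are illustrative only):

```python
import numpy as np

def softmax(x, axis=0):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_fusion(features, attn_logits):
    """Fuse per-modality features with learned voxel-wise attention weights.

    features:    (M, C, D, H, W) encoder feature maps for M available modalities
    attn_logits: (M, 1, D, H, W) logits from a (hypothetical) attention network
    returns:     (C, D, H, W) fused representation
    """
    weights = softmax(attn_logits, axis=0)  # weights over modalities sum to 1 per voxel
    return (weights * features).sum(axis=0)

def mean_var_fusion(features):
    """Baseline hetero-modal fusion: per-voxel mean and variance across modalities.

    Works for any number of available modalities, like the attention scheme,
    but applies the same fixed statistics everywhere instead of learned weights.
    """
    return np.concatenate([features.mean(axis=0), features.var(axis=0)], axis=0)

# Either fusion accepts any subset of modalities: simply stack the encoder
# outputs of whichever sequences were acquired for the patient.
feats = np.random.rand(3, 4, 2, 2, 2)   # 3 modalities, 4 channels, 2x2x2 volume
logits = np.random.randn(3, 1, 2, 2, 2)
fused_attn = attention_fusion(feats, logits)   # shape (4, 2, 2, 2)
fused_mv = mean_var_fusion(feats)              # shape (8, 2, 2, 2)
```

Because both operations reduce over the modality axis, the decoder sees a fixed-size representation regardless of how many input sequences are present, which is what makes hetero-modal synthesis from arbitrary subsets possible.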