Lee SiYeoul, Kim Seonho, Seo MinKyung, Park SeongKyu, Imrus Salehin, Ashok Kambaluru, Lee DongEon, Park Chunsu, Lee SeonYeong, Kim Jiye, Yoo Jae-Heung, Kim MinWoo
IEEE Trans Med Imaging. 2025 Jun 13;PP. doi: 10.1109/TMI.2025.3579454.
This study introduces a motion-based learning network with a global-local self-attention module (MoGLo-Net) to enhance 3D reconstruction in handheld photoacoustic and ultrasound (PAUS) imaging. Standard PAUS imaging is often limited by a narrow field of view (FoV) and cannot effectively visualize complex 3D structures. The 3D freehand technique, which aligns sequential 2D images for 3D reconstruction, struggles to estimate motion accurately without external positional sensors. MoGLo-Net addresses these limitations by adapting the self-attention mechanism to exploit critical regions within successive ultrasound images, such as fully developed speckle areas or high-echogenic tissue regions, to accurately estimate the motion parameters. This facilitates the extraction of intricate features from individual frames. Additionally, we employ a patch-wise correlation operation to generate a correlation volume that is highly correlated with the scanning motion. A custom loss function was also developed to ensure robust learning with minimized bias, leveraging the characteristics of the motion parameters. Experimental evaluations demonstrated that MoGLo-Net outperforms current state-of-the-art methods in both quantitative and qualitative evaluations. Furthermore, we extended 3D reconstruction beyond simple B-mode ultrasound volumes to incorporate Doppler ultrasound and photoacoustic imaging, enabling 3D visualization of vasculature. The source code for this study is publicly available at: https://github.com/pnu-amilab/US3D.
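To make the idea of a correlation volume concrete, the following is a minimal NumPy sketch of a generic local correlation between the feature maps of two successive frames. It is not the MoGLo-Net implementation: the function name `correlation_volume`, the `max_disp` parameter, the channel-mean dot product, and the zero padding are all illustrative assumptions, and the abstract does not specify how the patch-wise operation is actually defined.

```python
import numpy as np


def correlation_volume(feat_a, feat_b, max_disp=2):
    """Correlate each location of feat_a with nearby locations of feat_b.

    Hypothetical helper for illustration only (not from the MoGLo-Net code).
    feat_a, feat_b: feature maps of shape (C, H, W) from two successive frames.
    Returns an array of shape ((2 * max_disp + 1) ** 2, H, W), one channel per
    candidate displacement (dy, dx), ordered row-major over dy then dx.
    """
    if feat_a.shape != feat_b.shape:
        raise ValueError("feature maps must have the same shape")
    _, h, w = feat_a.shape
    d = max_disp
    # Zero-pad the spatial axes so every shifted window stays in bounds.
    padded = np.pad(feat_b, ((0, 0), (d, d), (d, d)))
    out = np.empty(((2 * d + 1) ** 2, h, w), dtype=np.float64)
    k = 0
    for dy in range(-d, d + 1):
        for dx in range(-d, d + 1):
            # shifted[:, y, x] equals feat_b[:, y + dy, x + dx] (zero outside).
            shifted = padded[:, d + dy:d + dy + h, d + dx:d + dx + w]
            # Channel-averaged dot product as the similarity score.
            out[k] = (feat_a * shifted).mean(axis=0)
            k += 1
    return out


# Usage with random stand-in features (real inputs would come from an encoder).
rng = np.random.default_rng(0)
frame_a = rng.standard_normal((8, 16, 16))
frame_b = rng.standard_normal((8, 16, 16))
volume = correlation_volume(frame_a, frame_b, max_disp=2)
```

In a sketch like this, the displacement channel with the strongest response at a location hints at how the tissue pattern moved between the two frames, which is the kind of motion cue a downstream network could regress pose parameters from.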