Zhou Chaochao, Faruqui Syed Hasib Akhter, An Dayeong, Patel Abhinav, Abdalla Ramez N, Hurley Michael C, Shaibani Ali, Potts Matthew B, Jahromi Babak S, Ansari Sameer A, Cantrell Donald R
Department of Radiology, Northwestern Medicine, Northwestern University, Chicago, IL, USA.
Department of Neurology, Northwestern Medicine, Northwestern University, Chicago, IL, USA.
J Imaging Inform Med. 2024 Dec 13. doi: 10.1007/s10278-024-01354-w.
Many tasks performed in image-guided procedures can be cast as pose estimation problems, in which specific projections are chosen to reach a target in 3D space. In this study, we construct a framework for fluoroscopic pose estimation and compare alternative loss functions and volumetric scene representations. We first develop a differentiable projection (DiffProj) algorithm for the efficient computation of Digitally Reconstructed Radiographs (DRRs) from either Cone-Beam Computed Tomography (CBCT) or neural scene representations. We introduce two novel neural scene representations, Neural Tuned Tomography (NeTT) and masked Neural Radiance Fields (mNeRF). Pose estimation is then performed within the framework by iterative gradient descent, using loss functions that quantify the image discrepancy of the synthesized DRR with respect to the ground-truth target fluoroscopic X-ray image. We compare the alternative loss functions and volumetric scene representations for pose estimation using a dataset of 50 cranial tomographic X-ray sequences. We find that Mutual Information significantly outperforms the alternative loss functions for pose estimation, avoiding entrapment in local optima. The alternative discrete (CBCT) and neural (NeTT and mNeRF) volumetric scene representations yield comparable performance (3D angle errors: mean ≤ 3.2° and 90% quantile ≤ 3.4°); however, the neural scene representations incur considerable computational expense to train.
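The abstract's central idea, scoring a candidate pose by the Mutual Information (MI) between a synthesized DRR and the target X-ray, can be illustrated with a toy example. The sketch below is not the authors' code: it uses a synthetic 2D image in place of a DRR, a simple 1D shift in place of a full 6-DoF pose, and a discrete search in place of iterative gradient descent. Its only purpose is to show how a histogram-based MI score peaks at the correct alignment.

```python
import numpy as np

def mutual_information(a, b, bins=32):
    """MI between two images, estimated from a joint intensity histogram."""
    hist, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    pxy = hist / hist.sum()                      # joint distribution p(x, y)
    px = pxy.sum(axis=1, keepdims=True)          # marginal p(x)
    py = pxy.sum(axis=0, keepdims=True)          # marginal p(y)
    nz = pxy > 0                                 # avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

# Synthetic "target fluoroscopic image": random texture plus a bright block.
rng = np.random.default_rng(0)
target = rng.random((64, 64))
target[20:40, 20:40] += 2.0

# Simulated mis-posed "DRR": the target shifted by an unknown 5-pixel offset.
true_shift = 5
moving = np.roll(target, true_shift, axis=1)

# 1D "pose search": MI between the target and each candidate un-shift.
scores = {s: mutual_information(target, np.roll(moving, -s, axis=1))
          for s in range(-10, 11)}
best = max(scores, key=scores.get)
print(best)  # MI peaks at the true offset, 5
```

In the paper's actual framework, the DRR is produced by the differentiable DiffProj renderer from a CBCT, NeTT, or mNeRF scene representation, so the MI loss can be minimized over the full pose by gradient descent rather than by the exhaustive search used here.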