Li Wensheng, Zeng Lingzhe, Gao Chengying, Liu Ning
IEEE Trans Vis Comput Graph. 2025 Sep;31(9):5494-5506. doi: 10.1109/TVCG.2024.3454467.
While numerous studies have explored NeRF-based novel view synthesis for dynamic humans, they often require training times exceeding several hours, limiting their practicality. Efforts to improve training efficiency have also met with challenges: non-rigid transformations are hard to optimize, leading to coarse renderings. In this work, we introduce an innovative approach for efficiently learning and integrating neural human representations. To achieve this, we propose comprehensively utilizing the features stored in both canonical and observational spaces through a collaborative refinement process that integrates canonical representations with observational details. Specifically, we first propose decomposing the high-dimensional multi-space feature volume into several feature planes, then using matrix multiplication to explicitly establish the correlations between different planes. This enables the simultaneous optimization of their counterparts across all dimensions by optimizing interpolated features, efficiently integrating associated details and accelerating convergence. Additionally, we use the proposed collaborative refinement process to iteratively enhance the canonical representation. By integrating multi-space representations, we further facilitate the co-optimization of multiple frames' time-dependent observations. Experiments demonstrate that our method achieves high-quality free-viewpoint renderings within roughly 5 minutes of optimization. Compared to state-of-the-art approaches, our results show more realistic rendering details, marking a significant advance in both performance and efficiency.
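The abstract does not include implementation details, but the core idea of factorizing a high-dimensional feature volume into 2D feature planes whose interpolated features are fused multiplicatively (so that one gradient step updates all planes jointly) can be sketched as follows. This is a minimal illustrative toy in NumPy, in the spirit of plane-factorized radiance-field representations; the plane names, resolutions, and the element-wise product fusion are assumptions for illustration, not the authors' code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed setup: factorize a 3D (x, y, z) feature volume into three
# learnable 2D feature planes instead of storing a dense R^3 grid.
R, C = 64, 16  # plane resolution and feature channels (illustrative)
plane_xy = rng.standard_normal((R, R, C))
plane_xz = rng.standard_normal((R, R, C))
plane_yz = rng.standard_normal((R, R, C))

def bilerp(plane, u, v):
    """Bilinearly interpolate an (R, R, C) feature plane at continuous (u, v)."""
    u0, v0 = int(np.floor(u)), int(np.floor(v))
    u1 = min(u0 + 1, plane.shape[0] - 1)
    v1 = min(v0 + 1, plane.shape[1] - 1)
    du, dv = u - u0, v - v0
    return ((1 - du) * (1 - dv) * plane[u0, v0]
            + du * (1 - dv) * plane[u1, v0]
            + (1 - du) * dv * plane[u0, v1]
            + du * dv * plane[u1, v1])

def query_features(x, y, z):
    """Fuse per-plane interpolated features with an element-wise product,
    so the gradient of the fused feature flows into every plane at once
    (the coupling that lets all dimensions be optimized simultaneously)."""
    f_xy = bilerp(plane_xy, x, y)
    f_xz = bilerp(plane_xz, x, z)
    f_yz = bilerp(plane_yz, y, z)
    return f_xy * f_xz * f_yz

feat = query_features(10.3, 20.7, 31.1)
print(feat.shape)  # (16,)
```

In a real system the planes would be trainable tensors in an autodiff framework and the fused feature would be decoded by a small MLP into density and color; the multiplicative coupling is what ties the per-plane parameters together during optimization.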