Signal Processing Laboratory, Swiss Federal Institute of Technology, Lausanne CH-1015, Switzerland.
IEEE Trans Image Process. 2013 Sep;22(9):3459-72. doi: 10.1109/TIP.2013.2270183. Epub 2013 Jun 19.
Enabling users to interactively navigate through different viewpoints of a static scene is a new and interesting functionality in 3D streaming systems. While it opens exciting perspectives toward rich multimedia applications, it requires the design of novel representations and coding techniques to solve the new challenges imposed by interactive navigation. In particular, the encoder must prepare a priori a compressed media stream that is flexible enough to enable the free selection of multiview navigation paths by different streaming media clients. Interactivity clearly brings new design constraints: the encoder is unaware of the exact decoding process, while the decoder has to reconstruct information from incomplete subsets of data, since the server generally cannot transmit images for all possible viewpoints due to resource constraints. In this paper, we propose a novel multiview data representation that permits us to satisfy bandwidth and storage constraints in an interactive multiview streaming system. In particular, we partition the multiview navigation domain into segments, each of which is described by a reference image (color and depth data) and some auxiliary information. The auxiliary information enables the client to recreate any viewpoint in the navigation segment via view synthesis. The decoder is then able to navigate freely in the segment without requesting further data from the server; it requests additional data only when it moves to a different segment. We discuss the benefits of this novel representation in interactive navigation systems and further propose a method to optimize the partitioning of the navigation domain into independent segments, under bandwidth and storage constraints. Experimental results confirm the potential of the proposed representation; namely, our system achieves compression performance similar to that of classical inter-view coding, while providing the high level of flexibility that is required for interactive streaming. Because of these unique properties, our new framework represents a promising solution for 3D data representation in novel interactive multimedia services.
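The abstract does not give the partitioning algorithm or the rate model, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes a one-dimensional navigation domain of discrete viewpoints, contiguous segments, a hypothetical per-segment cost function `segment_rate` (standing in for the size of one reference image plus auxiliary information), and a hypothetical cap `max_segment_rate` on the size of any single segment (standing in for the bandwidth constraint). Under those assumptions, a simple dynamic program finds the partition with the smallest total size (standing in for storage).

```python
from typing import Callable, List, Tuple


def partition_views(
    num_views: int,
    segment_rate: Callable[[int, int], float],
    max_segment_rate: float,
) -> Tuple[float, List[Tuple[int, int]]]:
    """Split views 0..num_views-1 into contiguous segments [start, end).

    Minimizes the sum of segment_rate over all segments (a storage proxy),
    subject to every segment costing at most max_segment_rate (a bandwidth
    proxy). Both quantities are assumptions made for illustration.
    """
    inf = float("inf")
    best = [inf] * (num_views + 1)   # best[e]: minimal cost to cover views 0..e-1
    prev = [-1] * (num_views + 1)    # prev[e]: start index of the last segment
    best[0] = 0.0

    for end in range(1, num_views + 1):
        for start in range(end):
            if best[start] == inf:
                continue  # the prefix before this segment is infeasible
            rate = segment_rate(start, end)
            if rate > max_segment_rate:
                continue  # this segment alone would exceed the cap
            total = best[start] + rate
            if total < best[end]:
                best[end] = total
                prev[end] = start

    if best[num_views] == inf:
        raise ValueError("no partition satisfies the per-segment cap")

    # Walk the back-pointers to recover the segment boundaries.
    segments: List[Tuple[int, int]] = []
    end = num_views
    while end > 0:
        start = prev[end]
        segments.append((start, end))
        end = start
    segments.reverse()
    return best[num_views], segments


def toy_rate(start: int, end: int) -> float:
    """Hypothetical cost: a fixed reference-image cost plus auxiliary
    information that grows with the width of the segment."""
    width = end - start
    return 10.0 + (width - 1) ** 2


if __name__ == "__main__":
    total, segments = partition_views(6, toy_rate, max_segment_rate=15.0)
    print(total, segments)
```

With the toy cost above, wide segments save reference images but pay more for auxiliary information, and the cap rules out segments wider than three views. A real system would replace `toy_rate` with measured coding rates and would likely need a richer model of the navigation domain.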