IEEE J Biomed Health Inform. 2018 Jul;22(4):1177-1188. doi: 10.1109/JBHI.2017.2726180. Epub 2017 Jul 12.
Wider adoption of mobile health (mHealth) video communication systems in standard clinical practice requires real-time control that keeps clinical video quality at a level adequate for reliable diagnosis. This can only be achieved by adapting in real time to the state of time-varying wireless networks, so that clinically acceptable performance is guaranteed throughout the streaming session while the device's capacity for real-time encoding is respected. We propose an adaptive video encoding framework based on multi-objective optimization that jointly maximizes the encoded video's quality and encoding rate (in frames per second) while minimizing bitrate demands. To this end, we construct a dense encoding space and use linear regression to estimate forward prediction models for quality, bitrate, and computational complexity. The prediction models are then used in an adaptive control framework that fine-tunes video encoding according to real-time constraints. We validate the system using a leave-one-out scheme applied to ten ultrasound videos of the common carotid artery. The prediction models estimate structural similarity (SSIM) quality with a median error of less than 1%, bitrate demands with an error of 10% or less, and encoding frame rate to within a 6% margin. Real-time adaptation at the group-of-pictures (GOP) level is demonstrated using the High Efficiency Video Coding (HEVC) standard. Compared with static, nonadaptive approaches, the proposed framework is shown to be effective in different modes of operation, achieving significant quality gains, reductions in bitrate demands, and performance improvements in real-life scenarios that impose time-varying constraints. Our approach is generic and should be applicable to other medical video modalities with different applications.
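The pipeline summarized in the abstract (fit linear forward models, check them with leave-one-out validation, then pick an encoding configuration under real-time constraints) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the single predictor (a quantization parameter, QP), the straight-line model form, the toy coefficients, and the helper names `fit_line`, `leave_one_out_median_error`, and `select_qp`. The abstract does not state which encoding parameters span the encoding space or how the multi-objective problem is solved, so the selection rule below (maximize predicted quality subject to bitrate and frame-rate limits) is only one plausible reading.

```python
from statistics import mean, median

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x (closed form)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope

def predict(model, x):
    """Evaluate a fitted (intercept, slope) model at x."""
    intercept, slope = model
    return intercept + slope * x

def leave_one_out_median_error(samples_by_video):
    """Hold out each video in turn, fit on the rest, and return the
    median absolute percentage error over all held-out samples."""
    errors = []
    for held_out, test_samples in samples_by_video.items():
        train = [s for name, samples in samples_by_video.items()
                 if name != held_out for s in samples]
        model = fit_line([x for x, _ in train], [y for _, y in train])
        errors.extend(abs(predict(model, x) - y) / abs(y) * 100
                      for x, y in test_samples)
    return median(errors)

def select_qp(models, candidate_qps, max_bitrate_kbps, min_fps):
    """Pick the candidate QP with the highest predicted SSIM among those whose
    predicted bitrate and encoding frame rate satisfy the current constraints.
    Ties are broken in favor of lower predicted bitrate. Returns None when no
    candidate is feasible."""
    feasible = [qp for qp in candidate_qps
                if predict(models["bitrate_kbps"], qp) <= max_bitrate_kbps
                and predict(models["fps"], qp) >= min_fps]
    if not feasible:
        return None
    return max(feasible, key=lambda qp: (predict(models["ssim"], qp),
                                         -predict(models["bitrate_kbps"], qp)))

# Toy (intercept, slope) models in QP; the numbers are invented for illustration.
TOY_MODELS = {
    "ssim": (1.0, -0.004),
    "bitrate_kbps": (3000.0, -60.0),
    "fps": (10.0, 0.5),
}

if __name__ == "__main__":
    # Re-run the selection whenever the constraints change, e.g. once per GOP.
    for bandwidth in (1500, 900):
        print(bandwidth, select_qp(TOY_MODELS, range(20, 42, 2), bandwidth, 20))
```

In an adaptive setting, `select_qp` would be called again whenever the measured bandwidth or the device's achievable frame rate changes, which mirrors the group-of-pictures-level adaptation described in the abstract without claiming to reproduce it.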