
Multiple global affine motion models used in video coding.



Abstract

The research presented in this dissertation explores a hybrid video codec's performance by simplifying its motion structure instead of complicating it. This stands in contrast to the latest compression standard, H.264, and to the majority of video researchers, who are exploring ever more complex motion models.

Specifically, we propose to use global motion models instead of local block-wise motion vectors to compress the motion information between consecutive frames. To cover rotation and zooming, which occur frequently in global motion, a six-parameter affine model is adopted instead of the more common 2-D translational one. To account for multiple moving objects in a video frame, motion segmentation is implemented on the scalable motion field of an H.264 encoder. An affine model is estimated for each segment and used for global motion compensation of the corresponding areas, and a warped reconstruction of the entire video frame is constructed using the segmentation map. The multiple affine models are predictively compressed with a specially designed vector quantizer, which consists of a long main dictionary stored offline and a short cache word list maintained online. The cache word list is searched for a match each time an affine model is quantized; the main dictionary is checked only when a "miss" occurs. While reconstructing the current frame with multiple affine models, the proposed video codec does not discard the classical block-matched reconstruction of each macroblock: a macroblock can be reconstructed under any of the original H.264 inter or intra modes, or with one of the affine models. Hence we add N affine modes to the original macroblock mode list of I4, I16, P16x16, P16x8, P8x16, P8x8, P4x4 and DIRECT, where N is the number of countable motion objects in the frame. The mode of each macroblock is then chosen from this extended list by Lagrangian optimization. By lengthening the mode list and spending moderately more bits on mode indication, we spare the encoder the prohibitive cost of transmitting a segmentation map to the decoder.

Finally, we present experimental results for our system in comparison with the latest published version of JM, the H.264 codec reference software. Our system shows a notable gain (up to 0.8 dB) in rate-distortion performance when the video stream bit rate is below 100 kbps, and 30%-70% of the macroblocks in a P-frame end up being encoded by the affine modes. The proposed system also shows many other advantages over traditional codecs, such as less pronounced blocking artifacts and greater error resilience.
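As context for the six-parameter affine model mentioned in the abstract, a standard affine warp of a pixel position (x, y) to (x', y') can be written as below; this is the generic formulation, and the exact parameterization used in the dissertation may differ.

\begin{pmatrix} x' \\ y' \end{pmatrix}
=
\begin{pmatrix} a_1 & a_2 \\ a_4 & a_5 \end{pmatrix}
\begin{pmatrix} x \\ y \end{pmatrix}
+
\begin{pmatrix} a_3 \\ a_6 \end{pmatrix}

The six parameters (a_1, ..., a_6) jointly describe translation, rotation, zoom and shear; the common 2-D translational model is the special case a_1 = a_5 = 1 and a_2 = a_4 = 0, leaving only the two offsets a_3 and a_6.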
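To make the two-level quantizer search concrete, the following is a minimal Python sketch of the cache-then-dictionary lookup described in the abstract. All names here (quantize_affine, the Euclidean distance test, the move-to-front cache update, the threshold parameter) are illustrative assumptions; the dissertation's actual codeword design, distance measure and cache replacement policy are not reproduced here.

import numpy as np

def quantize_affine(model, cache, main_dict, threshold=1e-3):
    """Quantize a 6-D affine parameter vector with a two-level search:
    the short online cache word list is tried first, and the long
    offline main dictionary is consulted only on a cache miss."""
    # 1. Search the cache word list for an acceptable match.
    for idx, word in enumerate(cache):
        if np.linalg.norm(model - word) < threshold:
            return "cache", idx      # hit: only a short cache index is signalled

    # 2. Cache miss: search the main dictionary for the nearest codeword.
    best = min(range(len(main_dict)),
               key=lambda i: np.linalg.norm(model - main_dict[i]))

    # 3. Illustrative cache update: promote the chosen codeword and evict
    #    the oldest entry (the real replacement policy is a codec design choice).
    cache.insert(0, main_dict[best])
    cache.pop()

    return "main", best

A caller would keep cache as a short list of recently used codewords and main_dict as the full offline codebook; the returned pair indicates whether a cheap cache index or a full dictionary index must be transmitted.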

Bibliographic Information

  • Author: Li, Xiaohuan.
  • Affiliation: Georgia Institute of Technology.
  • Degree-granting institution: Georgia Institute of Technology.
  • Subject: Engineering, Electronics and Electrical.
  • Degree: Ph.D.
  • Year: 2007
  • Pages: 97 p.
  • Total pages: 97
  • Format: PDF
  • Language: English (eng)
  • CLC classification: Radio electronics and telecommunications technology
  • Keywords

