JPEG2000-based scalable interactive video (JSIV)


Abstract

Video is considered one of the main applications of today's Internet. Despite its importance, the interactivity available from current implementations is limited to pause and random access to a set of predetermined access points. In this work, we propose a novel approach which provides considerably better interactivity, and we coin the term JPEG2000-Based Scalable Interactive Video (JSIV) for it. JSIV relies on three main concepts: storing the video sequence as independent JPEG2000 frames to provide for quality and spatial resolution scalability, as well as temporal and spatial accessibility; prediction and conditional replenishment of precincts to exploit inter-frame redundancy; and loosely-coupled server and client policies. The concept of loosely-coupled client and server policies is central to JSIV. With these policies, the server optimally selects the number of quality layers for each precinct it transmits and decides on any side-information that needs to be transmitted, while the client attempts to make the most of the received (distorted) frames. In particular, the client decides which precincts are predicted and which are decoded from received data (or possibly filled with zeros in the absence of received data). Thus, in JSIV, a predicted frame typically has some of its precincts predicted from nearby frames while others are decoded from received intra-coded precincts; JSIV never uses frame differences or prediction residues. The philosophy behind these policies is that neither the server nor the client drives the video streaming interaction; rather, the server dynamically selects and sends the pieces that, it thinks, best serve the client's needs and, in turn, the client makes the most of the pieces of information it has.
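The client-side choice described above can be illustrated with a minimal sketch. It assumes the client holds a distortion estimate for each candidate source of a precinct; the function name, the `None` convention for "not available", and the simple comparison rule are hypothetical and are not taken from the thesis.

```python
from typing import Optional


def choose_precinct_source(pred_distortion: Optional[float],
                           recv_distortion: Optional[float]) -> str:
    """Pick how one precinct is reconstructed (hypothetical rule).

    pred_distortion: estimated distortion if the precinct is predicted
                     from a nearby frame, or None if no prediction exists.
    recv_distortion: estimated distortion if the precinct is decoded from
                     received intra-coded data, or None if nothing arrived.
    """
    if pred_distortion is None and recv_distortion is None:
        return "zeros"      # nothing to predict from, nothing received
    if recv_distortion is None:
        return "predicted"  # only a prediction is available
    if pred_distortion is None:
        return "decoded"    # only received data is available
    # Both are available: keep whichever is expected to look better.
    return "predicted" if pred_distortion < recv_distortion else "decoded"
```

No residue is ever added to a predicted precinct in this sketch, which matches the statement that JSIV uses neither frame differences nor prediction residues.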
The JSIV paradigm postulates that if both the client and the server policies are intelligent enough and make reasonable decisions, then the decisions made by the server are likely to have the expected impact on the client's decisions. We solve the general JSIV optimization problem by employing Lagrange-style rate-distortion optimization in a two-pass iterative approach. We show that this approach converges under workable conditions, and we also show that the optimal solution for a given rate is not necessarily embedded in the optimal solution for a higher rate. The flexibility of the JSIV paradigm enables us to use it in a variety of frame prediction arrangements. In this work, we focus only on JSIV with a sequential prediction arrangement (similar to IPPP…) and a hierarchical B-frames prediction arrangement. We show that JSIV can provide the sought-after quality and spatial scalability in addition to temporal and spatial accessibility. We also demonstrate a novel way in which a JSIV client can use its cache to improve the quality of reconstructed video. In general, JSIV can serve a wide range of usage scenarios, but we expect that real-time and interactive applications, such as teleconferencing and surveillance, would benefit most from it. Experimental results show that JSIV's performance is slightly inferior to that of existing predictive coding standards in conventional streaming applications; however, JSIV produces significant improvements when its scalability and accessibility features, such as the region of interest, are employed.
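As a rough illustration of Lagrange-style rate-distortion optimization, the sketch below picks a number of quality layers per precinct by minimising D + λR and bisects on λ to meet a rate budget. It is a generic textbook formulation under simplifying assumptions: precincts are treated as independent, so it does not reproduce the two-pass iteration or the inter-frame prediction dependencies of JSIV. The names `pick_layers` and `allocate`, the rate-distortion tables, and the search range for λ are all hypothetical.

```python
from typing import List, Tuple

# One entry per layer count: (cumulative rate, resulting distortion).
# Index 0 is assumed to mean "send nothing" at zero rate.
RDPoint = Tuple[float, float]


def pick_layers(rd: List[RDPoint], lam: float) -> int:
    """Layer count minimising D + lam * R for a single precinct."""
    costs = [d + lam * r for r, d in rd]
    return min(range(len(rd)), key=costs.__getitem__)


def allocate(precincts: List[List[RDPoint]], budget: float,
             iters: int = 50) -> List[int]:
    """Bisect on lam so that the total rate fits within the budget."""
    lo, hi = 0.0, 1e9            # assumed search range for lam
    best = [0] * len(precincts)  # fall back to sending nothing
    for _ in range(iters):
        lam = (lo + hi) / 2
        choice = [pick_layers(rd, lam) for rd in precincts]
        rate = sum(rd[k][0] for rd, k in zip(precincts, choice))
        if rate > budget:
            lo = lam             # over budget: penalise rate more
        else:
            hi = lam             # feasible: try a smaller penalty
            best = choice
    return best
```

For example, with `[[(0, 100), (10, 40), (20, 30)], [(0, 50), (10, 45), (20, 44)]]` and a budget of 20, the whole budget goes to the first precinct, because its extra layers reduce distortion far more per unit of rate than those of the second.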
