
A Computational Model Of The Intelligibility Of American Sign Language Video And Video Coding Applications


Abstract

Real-time, two-way transmission of American Sign Language (ASL) video over cellular networks provides natural communication among members of the Deaf community. Bandwidth restrictions on cellular networks and limited computational power on cellular devices necessitate the use of advanced video coding techniques designed explicitly for ASL video. As a communication tool, compressed ASL video must be evaluated according to the intelligibility of the conversation, not according to conventional definitions of video quality. The intelligibility evaluation can either be performed using human subjects participating in perceptual experiments or using computational models suitable for ASL video. This dissertation addresses each of these issues in turn, presenting a computational model of the intelligibility of ASL video, which is demonstrated to be accurate with respect to true intelligibility ratings as provided by human subjects. The computational model affords the development of video compression techniques that are optimized for ASL video. Guided by linguistic principles and human perception of ASL, this dissertation presents a full-reference computational model of intelligibility for ASL (CIM-ASL) that is suitable for evaluating compressed ASL video. The CIM-ASL measures distortions only in regions relevant for ASL communication, using spatial and temporal pooling mechanisms that vary the contribution of distortions according to their relative impact on the intelligibility of the compressed video. The model is trained and evaluated using ground truth experimental data, collected in three separate perceptual studies. The CIM-ASL provides accurate estimates of subjective intelligibility and demonstrates statistically significant improvements over computational models traditionally used to estimate video quality. 
The CIM-ASL is incorporated into an H.264/AVC-compliant video coding framework, creating a closed-loop encoding system optimized explicitly for ASL intelligibility. This intelligibility-optimized coder achieves bitrate reductions between 10% and 42% without reducing intelligibility, when compared to a general-purpose H.264/AVC encoder. The intelligibility-optimized encoder is refined by introducing reduced-complexity encoding modes, which yield a 16% improvement in encoding speed. The purpose of the intelligibility-optimized encoder is to generate video that is suitable for real-time ASL communication. Ultimately, the preferences of ASL users determine the success of the intelligibility-optimized coder. User preferences are explicitly evaluated in a perceptual experiment in which ASL users select between the intelligibility-optimized coder and a general-purpose video coder. The results of this experiment demonstrate that the preferences vary depending on the demographics of the participants and that a significant proportion of users prefer the intelligibility-optimized coder.
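The abstract describes a full-reference measure that counts distortion only in regions relevant to signing and then pools it spatially and temporally. The sketch below illustrates that general idea and is not the CIM-ASL itself. Everything specific in it is an assumption made for illustration: the region names ("face", "hands"), the weights, the use of mean squared error as the distortion, the worst-frames averaging rule, and all function names are hypothetical and do not come from the dissertation.

```python
def region_mse(ref, test, mask):
    """Mean squared error over pixels where mask is 1 (0.0 if the mask is empty)."""
    total, count = 0.0, 0
    for ref_row, test_row, mask_row in zip(ref, test, mask):
        for r, t, m in zip(ref_row, test_row, mask_row):
            if m:
                total += (r - t) ** 2
                count += 1
    return total / count if count else 0.0


def frame_distortion(ref, test, masks, weights):
    """Spatial pooling (assumed form): weighted sum of per-region MSE.

    Pixels that fall outside every mask contribute nothing to the score.
    """
    return sum(weights[name] * region_mse(ref, test, masks[name]) for name in masks)


def pooled_distortion(ref_frames, test_frames, masks, weights, worst_fraction=0.5):
    """Temporal pooling (assumed form): mean of the worst-scoring fraction of frames."""
    scores = sorted(
        (frame_distortion(r, t, masks, weights)
         for r, t in zip(ref_frames, test_frames)),
        reverse=True,
    )
    if not scores:
        return 0.0
    keep = max(1, int(len(scores) * worst_fraction))
    return sum(scores[:keep]) / keep


if __name__ == "__main__":
    # Tiny 2x2 grayscale frames; the masks and weights are illustrative only.
    ref = [[10, 10], [10, 10]]
    distorted = [[12, 10], [10, 20]]  # bottom-right error lies outside both masks
    masks = {"face": [[1, 0], [0, 0]], "hands": [[0, 1], [0, 0]]}
    weights = {"face": 0.6, "hands": 0.4}
    print(pooled_distortion([ref, ref], [distorted, ref], masks, weights))
```

In this toy setup the large error in the bottom-right pixel does not affect the score because it lies in neither mask, which is the behavior the abstract attributes to measuring distortions only in regions relevant for ASL communication. A real model would fit the weights and pooling rule to subjective intelligibility ratings rather than fixing them by hand.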

Bibliographic record

  • Author

    Ciaramello Francis

  • Year 2011
  • Original format PDF
  • Language of text en_US
