Conference: International Conference on Quality of Multimedia Experience

How Deep is Your Encoder: An Analysis of Features Descriptors for an Autoencoder-Based Audio-Visual Quality Metric



Abstract

Developing audio-visual quality assessment models that predict accurately poses a number of challenges. One of them is modelling the complex interaction between audio and visual stimuli and how human users interpret that interaction. The No-Reference Audio-Visual Quality Metric Based on a Deep Autoencoder (NAViDAd) approaches this problem from a machine learning perspective. The metric receives two sets of feature descriptors, audio and video, and produces a low-dimensional set of features that is used to predict audio-visual quality. A basic implementation of NAViDAd produced accurate predictions when tested on a range of audio-visual databases. The current work performs an ablation study on the base architecture of the metric: several modules are removed or re-trained under different configurations to better understand how the metric works. The results provide feedback that helps us understand the real capacity of the metric's architecture and, eventually, develop a better audio-visual quality metric.
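To make the data flow in the abstract concrete, the following is a minimal sketch, not the NAViDAd implementation. The abstract does not specify layer counts, sizes, activations, or training details, so everything here is an assumption made for illustration: the class name `TinyAudioVisualAutoencoder`, the single-layer encoder and decoder, the sigmoid activation, the linear quality regressor, and the random (untrained) weights. It only shows how audio and video descriptors could be joined, compressed into a low-dimensional code, and mapped to a quality score.

```python
import math
import random


def sigmoid(value):
    """Logistic activation, squashing a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-value))


def dense(inputs, weights, biases, activation=None):
    """Apply one fully connected layer: activation(W x + b)."""
    outputs = []
    for row, bias in zip(weights, biases):
        value = sum(w * x for w, x in zip(row, inputs)) + bias
        outputs.append(activation(value) if activation else value)
    return outputs


def random_layer(n_in, n_out, rng):
    """Hypothetical random initialisation; a real model would learn these weights."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


class TinyAudioVisualAutoencoder:
    """Illustrative only: layer sizes and structure are assumptions, not the paper's."""

    def __init__(self, n_audio, n_video, n_code, seed=0):
        rng = random.Random(seed)
        n_in = n_audio + n_video
        self.encoder = random_layer(n_in, n_code, rng)    # descriptors -> low-dimensional code
        self.decoder = random_layer(n_code, n_in, rng)    # code -> reconstructed descriptors
        self.regressor = random_layer(n_code, 1, rng)     # code -> scalar quality score

    def encode(self, audio, video):
        """Concatenate both descriptor sets and compress them into the code."""
        return dense(audio + video, *self.encoder, activation=sigmoid)

    def reconstruct(self, code):
        """Map the code back to the input dimensionality (the autoencoder objective)."""
        return dense(code, *self.decoder)

    def predict_quality(self, audio, video):
        """Predict a quality score from the low-dimensional code."""
        return dense(self.encode(audio, video), *self.regressor)[0]


model = TinyAudioVisualAutoencoder(n_audio=4, n_video=6, n_code=3)
audio_features = [0.1, 0.2, 0.3, 0.4]
video_features = [0.5, 0.4, 0.3, 0.2, 0.1, 0.0]
code = model.encode(audio_features, video_features)
score = model.predict_quality(audio_features, video_features)
```

In this toy setting, an ablation in the spirit of the study would amount to removing or re-initialising one of the three components (for example, feeding only the audio descriptors, or changing `n_code`) and comparing the resulting predictions after training.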
