Fuzzy Audio-Visual Feature Maps for Speaker Identification

Abstract

Speech-based person recognition by machine has not reached the level of technological maturity required by some of its potential applications. The deficiencies revolve around sub-optimal pre-processing, feature extraction or selection, and classification, particularly under conditions of input data variability. The joint use of audible and visible manifestations of speech aims to alleviate these shortcomings, but the development of effective combination techniques is challenging. This paper proposes and evaluates a combination approach for speaker identification based on fuzzy modelling of acoustic and visual speaker characteristics. The proposed audio-visual model has been evaluated experimentally on a speaker identification task. The results show that the joint model outperforms its isolated components in terms of identification accuracy. In particular, the cross-modal coupling of audio-visual streams is shown to improve identification accuracy.