Conference: International Conference on Multimedia and Expo

AUDIOVISUAL-BASED ADAPTIVE SPEAKER IDENTIFICATION


Abstract

This paper presents an adaptive speaker identification system that recognizes speakers in feature films by exploiting both audio and visual cues. Specifically, the audio source is first analyzed to identify speakers using a likelihood-based approach. Meanwhile, the visual source is parsed to recognize talking faces using face detection/recognition and mouth tracking techniques. These two information sources are then integrated under a probabilistic framework to improve system performance. Moreover, to account for variations in speakers' voices over time, their acoustic models are updated on the fly by adapting them to newly contributed speech data. An average identification accuracy of 80% has been achieved on two test movies, which suggests that the proposed audiovisual-based adaptive speaker identification approach is promising.
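
The abstract does not give the fusion rule or the adaptation formula, so the following is only a minimal sketch of how the two ideas could look in code. It assumes a weighted log-linear combination of audio and visual scores and a MAP-style interpolation of a model mean. The function names `fuse_scores` and `adapt_mean`, and the parameters `audio_weight` and `relevance`, are hypothetical and not taken from the paper.

```python
import math


def fuse_scores(audio_loglik, visual_prob, audio_weight=0.5):
    """Fuse per-speaker audio log-likelihoods with visual talking-face
    probabilities into one normalized score per speaker.

    Assumption: a weighted geometric mean of the two cues; the paper's
    actual probabilistic framework may differ.
    """
    # Turn audio log-likelihoods into a posterior with a stable softmax.
    peak = max(audio_loglik.values())
    expo = {s: math.exp(v - peak) for s, v in audio_loglik.items()}
    norm = sum(expo.values())
    audio_post = {s: e / norm for s, e in expo.items()}

    fused = {}
    for speaker, a in audio_post.items():
        # Floor the visual cue so a missed face does not zero out a speaker.
        v = max(visual_prob.get(speaker, 0.0), 1e-6)
        fused[speaker] = (a ** audio_weight) * (v ** (1.0 - audio_weight))

    total = sum(fused.values())
    return {s: f / total for s, f in fused.items()}


def adapt_mean(old_mean, new_frames, relevance=16.0):
    """Shift a model mean toward newly observed feature frames.

    Assumption: MAP-style interpolation, where more new frames pull the
    mean further from its previous value.
    """
    n = len(new_frames)
    if n == 0:
        return list(old_mean)
    dim = len(old_mean)
    sample_mean = [sum(f[d] for f in new_frames) / n for d in range(dim)]
    alpha = n / (n + relevance)
    return [alpha * sm + (1.0 - alpha) * om
            for sm, om in zip(sample_mean, old_mean)]


def identify(audio_loglik, visual_prob, audio_weight=0.5):
    """Return the speaker with the highest fused score."""
    fused = fuse_scores(audio_loglik, visual_prob, audio_weight)
    return max(fused, key=fused.get)
```

In this sketch, `identify` would be called once per speech segment, and `adapt_mean` would then be applied to the winning speaker's model using that segment's feature frames, which mirrors the "update on the fly" step described in the abstract.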
