Journal of Signal Processing Systems for Signal, Image, and Video Technology

Adaptive Reliability Measure and Optimum Integration Weight for Decision Fusion Audio-visual Speech Recognition


Abstract

Audio-visual speech recognition (AVSR), which uses both the acoustic and visual signals of speech, has recently received attention because of its robustness in noisy environments. An important issue in decision-fusion-based AVSR systems is determining an appropriate integration weight for combining the speech modalities, so that performance stays high under various SNR conditions. Generally, the integration weight is calculated from the relative reliability of the two modalities. This paper investigates the effect of the reliability measure on integration weight estimation and proposes a genetic algorithm (GA) based reliability measure that uses an optimum number of best recognition hypotheses, rather than the N best recognition hypotheses, to determine an appropriate integration weight. Recognition accuracy is improved further by optimizing the resulting integration weight with a genetic algorithm. The performance of the proposed integration weight estimation scheme is demonstrated for isolated word recognition (covering commonly used mobile phone functions) in experiments on a multi-speaker database. The results show that, under various SNR conditions, the proposed schemes achieve more robust recognition accuracy than conventional unimodal systems and than two related existing bimodal systems: the baseline reliability-ratio-based system and the N-best recognition hypotheses reliability-ratio-based system.
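The abstract does not give formulas, so the following is only a minimal sketch of how a reliability-based integration weight might drive decision fusion. It assumes that each modality outputs one log-likelihood per vocabulary word, that reliability is the mean gap between the best score and the next-best scores, and that the audio weight is the ratio of audio reliability to total reliability. The names `nbest_reliability`, `integration_weight`, `fuse`, and the parameter `k` are hypothetical. The GA that the paper uses to choose the number of hypotheses and to refine the weight is not reproduced here; `k` is simply passed in.

```python
def nbest_reliability(log_likes, k):
    """Assumed reliability measure: mean gap between the best
    log-likelihood and the other scores among the k best hypotheses."""
    top = sorted(log_likes, reverse=True)[:k]
    if len(top) < 2:
        return 0.0
    return sum(top[0] - s for s in top[1:]) / (len(top) - 1)


def integration_weight(audio_ll, visual_ll, k):
    """Assumed reliability-ratio weight for the audio stream, in [0, 1]."""
    s_a = nbest_reliability(audio_ll, k)
    s_v = nbest_reliability(visual_ll, k)
    total = s_a + s_v
    # Fall back to equal weighting when neither stream discriminates.
    return 0.5 if total == 0 else s_a / total


def fuse(audio_ll, visual_ll, k):
    """Weighted sum of per-word log-likelihoods; returns (word index, weight)."""
    gamma = integration_weight(audio_ll, visual_ll, k)
    combined = [gamma * a + (1.0 - gamma) * v
                for a, v in zip(audio_ll, visual_ll)]
    best = max(range(len(combined)), key=combined.__getitem__)
    return best, gamma


# Illustrative scores only (not from the paper): the audio stream is
# confident about word 0, the visual stream is nearly undecided.
audio_scores = [-10.0, -30.0, -35.0, -40.0]
visual_scores = [-20.0, -19.0, -21.0, -22.0]
word, gamma = fuse(audio_scores, visual_scores, k=3)
```

With these made-up scores the audio stream has much larger gaps among its top hypotheses, so it receives most of the weight and its preferred word wins. Changing `k` changes the reliability estimate, which is the quantity the paper tunes.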
