Environmental robust speech and speaker recognition through multi-channel histogram equalization

Abstract

Feature statistics normalization in the cepstral domain is one of the most effective approaches to robust automatic speech and speaker recognition in noisy acoustic scenarios: feature coefficients are normalized through suitable linear or nonlinear transformations so that the statistics of the noisy speech match those of clean speech. Histogram equalization (HEQ) belongs to this category of algorithms, has proved effective for this purpose, and is therefore taken here as the reference. In this paper, the availability of multiple acoustic channels is used to enhance the statistics modeling capabilities of the HEQ algorithm by exploiting multiple noisy speech occurrences, with the aim of maximizing the effectiveness of the cepstral normalization process. Computer simulations based on the Aurora 2 database in speech and speaker recognition scenarios show that a significant recognition improvement over the single-channel counterpart and other multi-channel techniques can be achieved, confirming the effectiveness of the idea. The proposed algorithmic configuration has also been combined with the kernel estimation technique to further improve speech recognition performance.
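The normalization idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the channels are combined or which reference distribution is used, so pooling the frames of all channels to estimate the empirical CDF and mapping to a standard Gaussian are both assumptions made here for illustration. The function names `heq` and `multichannel_heq` are hypothetical.

```python
import numpy as np
from statistics import NormalDist


def heq(features, stats_pool=None):
    """Histogram-equalize each cepstral coefficient to a standard normal.

    features:   (frames, coeffs) array to normalize.
    stats_pool: optional (n, coeffs) array used to estimate the empirical
                CDF; defaults to `features` itself (single-channel case).
    The Gaussian reference is an assumption, not taken from the paper.
    """
    features = np.asarray(features, dtype=float)
    pool = features if stats_pool is None else np.asarray(stats_pool, dtype=float)
    inv_cdf = np.vectorize(NormalDist().inv_cdf)
    n = pool.shape[0]
    out = np.empty_like(features)
    for k in range(features.shape[1]):
        sorted_pool = np.sort(pool[:, k])
        # Rank of each frame value within the pool gives its empirical CDF.
        rank = np.searchsorted(sorted_pool, features[:, k], side="right")
        # Keep the CDF strictly inside (0, 1) so the inverse stays finite.
        cdf = np.clip((rank - 0.5) / n, 0.5 / n, 1.0 - 0.5 / n)
        # Map through the inverse CDF of the reference distribution.
        out[:, k] = inv_cdf(cdf)
    return out


def multichannel_heq(channels):
    """Equalize every channel using a CDF estimated from all channels.

    channels: list of (frames, coeffs) arrays, one per microphone.
    Pooling the channels is an illustrative assumption about how the
    extra noisy occurrences could improve the statistics estimate.
    """
    pool = np.concatenate(channels, axis=0)
    return [heq(ch, stats_pool=pool) for ch in channels]
```

In this sketch the single-channel case estimates each coefficient's CDF from one utterance only, while the multi-channel case has more samples per coefficient, which is one plausible reading of "enhancing the statistics modeling capabilities" of HEQ.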

Bibliographic record

  • Source
    Neurocomputing | 2012, Issue 1 | pp. 111-120 | 10 pages
  • Author affiliations

    MediaLabs, Department of Information Engineering, Universita Politecnica delle Marche, Via Brecce Bianche 1. 60131, Ancona, Italy;

    MediaLabs, Department of Information Engineering, Universita Politecnica delle Marche, Via Brecce Bianche 1. 60131, Ancona, Italy;

    MediaLabs, Department of Information Engineering, Universita Politecnica delle Marche, Via Brecce Bianche 1. 60131, Ancona, Italy;

    MediaLabs, Department of Information Engineering, Universita Politecnica delle Marche, Via Brecce Bianche 1. 60131, Ancona, Italy;

  • Indexed in: Science Citation Index (SCI), USA; Engineering Index (EI), USA
  • Original format: PDF
  • Language: English
  • Chinese Library Classification
  • Keywords

    multi-channel audio processing; feature statistics normalization; histogram equalization; speech recognition; speaker recognition;

