IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Easy does it: Robust spectro-temporal many-stream ASR without fine tuning streams

Abstract

Previous work has shown that spectro-temporal features reduce the word error rate of automatic speech recognition under noisy conditions. These systems, however, required significant hand-tuning to determine which spectral and temporal modulations should be included in a particular stream. In this work, each stream is restricted to one spectral and one temporal modulation, and the streams' posterior probabilities are combined after each stream has been discriminatively trained with a multilayer perceptron. We show that this combination structure performs as well as or better than more elaborate methods in which multiple spectral and temporal modulations are hand-picked per stream. In addition, this type of feature outperforms standard noise-robust features such as the “Advanced Front End” features, whereas our hand-picked spectro-temporal features do not.
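
The abstract does not say which rule is used to combine the per-stream posteriors. As a rough illustration only, the sketch below merges the phone posteriors of several streams for a single frame with an equally weighted geometric mean, one common choice for multi-stream systems. The function name `combine_posteriors`, the equal weights, and the toy numbers are assumptions made for this example and are not taken from the paper.

```python
import math


def combine_posteriors(stream_posteriors, weights=None):
    """Merge per-stream class posteriors for one frame.

    stream_posteriors: list of probability vectors, one per stream
        (each would come from that stream's MLP in a real system).
    weights: optional per-stream weights; equal weights are assumed
        when omitted. The paper's actual rule may differ.
    """
    n_streams = len(stream_posteriors)
    n_classes = len(stream_posteriors[0])
    if weights is None:
        weights = [1.0 / n_streams] * n_streams

    eps = 1e-12  # floor so that log(0) never occurs
    # Weighted sum of log-posteriors = log of the weighted geometric mean.
    log_combined = [
        sum(w * math.log(max(p[k], eps))
            for w, p in zip(weights, stream_posteriors))
        for k in range(n_classes)
    ]

    # Renormalize so the result sums to one (subtract the max for stability).
    peak = max(log_combined)
    unnormalized = [math.exp(v - peak) for v in log_combined]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


# Toy example: three streams, three phone classes, one frame.
frame = [
    [0.7, 0.2, 0.1],
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
]
combined = combine_posteriors(frame)
best_class = combined.index(max(combined))
```

Working in the log domain keeps the product of many small probabilities from underflowing, which matters as the number of streams grows. Other rules, such as an arithmetic mean or a learned merger network, would fit the same interface.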
