Journal: EURASIP Journal on Audio, Speech, and Music Processing

Segment boundary detection directed attention for online end-to-end speech recognition


Abstract

Attention-based encoder-decoder models have recently shown competitive performance for automatic speech recognition (ASR) compared to conventional ASR systems. However, how to employ attention models for online speech recognition still needs to be explored. Unlike conventional attention models, in which the soft alignment is obtained by a pass over the entire input sequence, attention models for online recognition must learn an online alignment that attends to part of the input sequence monotonically when generating output symbols. Based on the fact that every output symbol corresponds to a segment of the input sequence, we propose a new attention mechanism for learning online alignment by decomposing the conventional alignment into two parts: segmentation (segment boundary detection with a hard decision) and segment-directed attention (information aggregation within the segment with soft attention). The boundary detection is conducted along the time axis from left to right, and a decision is made for each input frame about whether it is a segment boundary or not. When a boundary is detected, the decoder generates an output symbol by attending to the inputs within the corresponding segment. With the proposed attention mechanism, online speech recognition can be realized. The experimental results on the TIMIT and WSJ datasets show that our proposed attention mechanism achieves online performance comparable to that of state-of-the-art models.
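
The two-part decomposition described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `segment_directed_attention`, the fixed `threshold`, the dot-product scoring, and the precomputed `boundary_prob` values are all assumptions made for illustration. In the actual model the boundary detector and the attention scores would be learned, and the query would come from the decoder state.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def segment_directed_attention(frames, boundary_prob, query, threshold=0.5):
    """Scan frames left to right; at each detected boundary, attend
    softly over the frames of the segment that just ended.

    frames        : list of encoder vectors, one per input frame
    boundary_prob : hypothetical per-frame boundary probabilities
    query         : hypothetical stand-in for the decoder state
    """
    segments, contexts = [], []
    start = 0
    for t, p in enumerate(boundary_prob):
        if p > threshold:  # hard decision: frame t closes a segment
            seg = frames[start:t + 1]
            # Soft attention restricted to the current segment
            # (dot-product scoring is an assumption of this sketch).
            weights = softmax([dot(f, query) for f in seg])
            dim = len(seg[0])
            context = [sum(w * f[i] for w, f in zip(weights, seg))
                       for i in range(dim)]
            segments.append((start, t))
            contexts.append(context)
            start = t + 1  # the next segment begins after the boundary
    return segments, contexts

frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0],
          [2.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
boundary_prob = [0.1, 0.2, 0.9, 0.3, 0.8, 0.1]
query = [1.0, 0.0]
segments, contexts = segment_directed_attention(frames, boundary_prob, query)
print(segments)  # [(0, 2), (3, 4)]
```

Because each context vector depends only on frames up to the detected boundary, an output symbol could be emitted as soon as its segment ends, which is what makes the scheme usable online. The last frame in the example never triggers a boundary, so it produces no output.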
