Conference: FISITA World Automotive Congress

Barge-in Implementation Method for Multi-CPU In-Vehicle Speech Recognition System

Abstract

The objective is to implement a barge-in function for a multi-CPU in-vehicle speech recognition system. Barge-in allows the user to speak while the system is still playing a guidance prompt. The barge-in function requires two audio inputs: the speech signal from the microphone and the original guidance prompt data, which serves as the reference. Implementing barge-in raises two major problems. The first is sampling frequency synchronization, because the two audio inputs usually have different sampling frequencies. The second is input timing synchronization, because the specification requires the timing gap between the two audio inputs to be within ±7 ms. We adopted a digital hardware converter for sampling frequency conversion. Hardware processing introduces a much smaller signal input delay than software processing, and, unlike analog processing, digital processing does not degrade signal quality, so barge-in performance is preserved. We store the reference data immediately before the audio input to the barge-in module, which removes the variability caused by transmitting the reference data between two CPUs. For the microphone input, we reduce variability by synchronizing the microphone input request with the guidance prompt playback request. The guidance prompt picked up by the microphone is passed to the barge-in module together with the reference data, sequentially and in the smallest unit, to maintain real-time processing. The design was implemented and evaluated, and it was validated in terms of both the input timing gap between the two audio inputs to the barge-in module and the performance of the barge-in function. This study is specific to multi-CPU in-vehicle speech recognition systems. A prerequisite is that the full reference data must be held in memory to maintain real-time processing.
Our method solves the two major problems through hardware sampling frequency conversion and the proposed input timing synchronization method. It also preserves the real-time processing of speech recognition even with the barge-in function added as a preprocessing step. In summary, we proposed a barge-in implementation method for a multi-CPU in-vehicle speech recognition system that resolves the sampling frequency and input timing synchronization problems of the two audio inputs and does not introduce latency issues for speech recognition.
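The ±7 ms timing requirement and the unit-by-unit pairing of microphone and reference data can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the 16 kHz sampling rate, the 10 ms frame length, and the use of brute-force cross-correlation to estimate the gap are all assumptions made for illustration. Only the ±7 ms tolerance comes from the abstract.

```python
import random
from typing import Iterator, List, Sequence, Tuple

TOLERANCE_MS = 7.0  # from the abstract: gap must be within +/- 7 ms


def paired_frames(
    mic: Sequence[float], ref: Sequence[float], frame_len: int
) -> Iterator[Tuple[Sequence[float], Sequence[float]]]:
    """Yield (mic_frame, ref_frame) pairs of equal length, in order.

    Hypothetical stand-in for feeding the barge-in module one unit at a time.
    Trailing samples that do not fill a whole frame are dropped.
    """
    usable = min(len(mic), len(ref))
    for start in range(0, usable - frame_len + 1, frame_len):
        end = start + frame_len
        yield mic[start:end], ref[start:end]


def estimate_gap_samples(
    mic: Sequence[float], ref: Sequence[float], max_lag: int
) -> int:
    """Return the lag (in samples) at which mic best matches ref.

    Brute-force cross-correlation over lags in [-max_lag, max_lag].
    A positive result means the microphone signal trails the reference.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, r in enumerate(ref):
            j = i + lag
            if 0 <= j < len(mic):
                score += r * mic[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def gap_within_tolerance(
    lag_samples: int, sample_rate_hz: int, tolerance_ms: float = TOLERANCE_MS
) -> bool:
    """Convert a lag in samples to milliseconds and compare with the tolerance."""
    gap_ms = 1000.0 * lag_samples / sample_rate_hz
    return abs(gap_ms) <= tolerance_ms


if __name__ == "__main__":
    sample_rate_hz = 16000  # assumed common rate after conversion
    rng = random.Random(0)
    # Synthetic noise stands in for the guidance prompt (reference data).
    ref: List[float] = [rng.uniform(-1.0, 1.0) for _ in range(400)]
    # Simulated microphone pickup: the same prompt delayed by 48 samples (3 ms).
    mic: List[float] = [0.0] * 48 + ref

    lag = estimate_gap_samples(mic, ref, max_lag=200)
    print("estimated gap (samples):", lag)
    print("within tolerance:", gap_within_tolerance(lag, sample_rate_hz))
    for mic_frame, ref_frame in paired_frames(mic, ref, frame_len=160):
        # A real barge-in module would consume each pair here.
        print("frame pair lengths:", len(mic_frame), len(ref_frame))
```

In a deployed system the gap would be bounded by design, through the request synchronization and reference buffering described in the abstract, rather than estimated after the fact. A check of this kind is more plausibly useful as an evaluation aid.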
