IEEE International Conference on Systems, Man, and Cybernetics
Performance Analysis of Distributed Speech Recognition Using Analysis-by-Synthesis Frame Reduced Front End under Packet Loss Conditions

Abstract

We proposed an analysis-by-synthesis (AbS) frame-dropping algorithm for the front end of a distributed speech recognition (DSR) system. The algorithm preserves rapidly changing frames, which are more relevant to speech perception, and discards slowly changing frames, which carry little information. When DSR is deployed over error-prone packet-switched networks, speech data inevitably suffer frame loss because packets may be lost or delayed by congestion at routers. We therefore also employed a model-adaptation error-concealment decoder at the back end to compensate for the mismatch between the pre-trained models and the test data, which contain missing frames caused both by frame dropping at the front end and by packet loss over the transmission channel. For convenience, this approach is denoted AbS-MA. During AbS-MA decoding, the transition probabilities of the hidden Markov models are adapted dynamically according to the time difference between successive observations. Experiments on Mandarin digit recognition were conducted to evaluate AbS-MA over a wide range of combinations of frame rates and packet loss conditions. AbS-MA was compared with a baseline in which error concealment was implemented at the back end by interpolating estimates of the missing frames from the received observations. The experimental results show that AbS-MA not only achieves higher word accuracy than the baseline but also significantly reduces computation time.
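The abstract does not state the exact adaptation rule, so the following is only a minimal sketch under stated assumptions. It assumes that a time difference of `gap` frame periods between two successive received observations is handled by using the `gap`-step transition matrix (the matrix power of the one-step matrix), and it contrasts this with an interpolation-style concealment like the baseline. The function names `adapted_transitions` and `interpolate_missing`, and the choice of linear interpolation, are hypothetical and are not taken from the paper.

```python
from typing import List

Matrix = List[List[float]]
Frame = List[float]


def mat_mul(a: Matrix, b: Matrix) -> Matrix:
    """Multiply two square matrices of equal size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def adapted_transitions(a: Matrix, gap: int) -> Matrix:
    """Return the gap-step transition matrix (assumed adaptation rule).

    gap is the time difference, in original frame periods, between two
    successive received observations; gap == 1 means no frame was skipped.
    """
    if gap < 1:
        raise ValueError("gap must be a positive integer")
    result = a
    for _ in range(gap - 1):
        result = mat_mul(result, a)
    return result


def interpolate_missing(prev: Frame, nxt: Frame, gap: int) -> List[Frame]:
    """Baseline-style concealment: linearly interpolate the gap - 1
    missing frames between two received feature vectors (assumed form)."""
    return [[p + (q - p) * step / gap for p, q in zip(prev, nxt)]
            for step in range(1, gap)]


# Hypothetical two-state, left-to-right HMM transition matrix.
A = [[0.5, 0.5],
     [0.0, 1.0]]

# One frame was skipped, so successive observations are 2 periods apart.
two_step = adapted_transitions(A, 2)

# Three frames are missing between two received feature vectors.
filled = interpolate_missing([0.0, 2.0], [4.0, 6.0], 4)
```

In this sketch the adapted decoder scores only the received observations and changes the transition matrix per gap, whereas the interpolation approach first reconstructs every missing frame and then decodes the full-rate sequence.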
