Journal: ACM Transactions on Asian Language Information Processing

Interruption Point Detection of Spontaneous Speech Using Inter-Syllable Boundary-Based Prosodic Features


Abstract

This article presents a probabilistic scheme for detecting interruption points (IPs) in spontaneous speech using prosodic features extracted at inter-syllable boundaries. Because spontaneous speech recognition has a high error rate, a combined acoustic model that considers both syllable and subsyllable recognition units is first used to determine the inter-syllable boundaries and to output the recognition confidence of the input speech. Based on the finding that IPs always occur at inter-syllable boundaries, a probability distribution of the prosodic features at the current potential IP is estimated. A Conditional Random Field (CRF) model, which uses the clustered prosodic features of the current potential IP and of its preceding and succeeding inter-syllable boundaries, then outputs an IP likelihood measure. Finally, the confidence of the recognized speech, the probability distribution of the prosodic features, and the CRF-based IP likelihood measure are integrated to determine the optimal IP sequence of the input spontaneous speech. Pitch reset and lengthening are also applied to improve IP detection performance. The Mandarin Conversational Dialogue Corpus is adopted for evaluation. Experimental results show that the proposed IP detection approach outperforms the hidden Markov model and the maximum entropy model by 10.56% and 6.5%, respectively, under the same experimental conditions. In addition, using pitch reset and lengthening information further reduces the IP detection error rate by 9.15%. These results confirm that the proposed model, built on inter-syllable boundary-based prosodic features, can effectively detect interruption points in spontaneous Mandarin speech.
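The abstract does not say how the three scores are integrated, so the following is only a minimal sketch of one plausible reading: a weighted log-linear combination evaluated at each inter-syllable boundary. The names `Boundary`, `ip_score`, and `detect_ips`, the weights, the threshold, and all numeric values are assumptions made for illustration and do not come from the article. Deciding each boundary independently is also a simplification; the article searches for an optimal IP sequence.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Boundary:
    """Scores for one inter-syllable boundary (illustrative values only)."""
    confidence: float      # recognition confidence around the boundary, in (0, 1]
    prosody_prob: float    # probability of the prosodic features given an IP, in (0, 1]
    crf_likelihood: float  # CRF-based IP likelihood, in (0, 1]


def ip_score(b: Boundary, weights: Tuple[float, float, float] = (1.0, 1.0, 1.0)) -> float:
    """Weighted sum of log scores (assumed combination rule, not the article's)."""
    w_conf, w_pros, w_crf = weights
    return (w_conf * math.log(b.confidence)
            + w_pros * math.log(b.prosody_prob)
            + w_crf * math.log(b.crf_likelihood))


def detect_ips(boundaries: List[Boundary],
               threshold: float = -3.0,
               weights: Tuple[float, float, float] = (1.0, 1.0, 1.0)) -> List[int]:
    """Return indices of boundaries whose combined score reaches the threshold."""
    return [i for i, b in enumerate(boundaries) if ip_score(b, weights) >= threshold]


# Three hypothetical boundaries; only the middle one scores high on all three measures.
boundaries = [
    Boundary(confidence=0.90, prosody_prob=0.10, crf_likelihood=0.05),
    Boundary(confidence=0.80, prosody_prob=0.70, crf_likelihood=0.90),
    Boundary(confidence=0.95, prosody_prob=0.20, crf_likelihood=0.10),
]
print(detect_ips(boundaries))  # [1]
```

In practice the weights and threshold would have to be tuned on held-out data, and a sequence-level search would replace the per-boundary threshold.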

Bibliographic details

  • Author affiliations

    Department of Computer Science and Information Engineering, National Cheng Kung University, Tainan, Taiwan, 701;

    Department of Computer Science and Information Engineering, National Cheng Kung University, Tainan, Taiwan, 701;

    Department of Computer Science and Information Engineering, National Chiayi University, Chiayi, Taiwan, 600;

  • Original format: PDF
  • Language: English
