Conference: International Conference on Text, Speech and Dialogue

Phonetic Segmentation Using Knowledge from Visual and Perceptual Domain


Abstract

Accurate, automatic phonetic segmentation is crucial for several speech-based applications, such as phone-level articulation analysis and error detection, speech synthesis, annotation, speech recognition, and emotion recognition. In this paper, we examine the effectiveness of visual features, obtained by processing the spectrogram image of a speech utterance, for phonetic segmentation. Further, we propose a mechanism to combine knowledge from the visual and perceptual domains for automatic phonetic segmentation. This process can be considered analogous to manual phonetic segmentation. The technique was evaluated on the TIMIT American English corpus. Experimental results show significant improvements in phonetic segmentation, especially at the lower tolerances of 5, 10, and 15 ms; an absolute improvement of 8.29% is observed on TIMIT at a 10 ms tolerance.
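The tolerance figures above refer to how far a detected phone boundary may lie from the reference boundary and still count as correct. The abstract does not spell out the exact scoring rule, so the following is only a minimal sketch of one common convention: the fraction of reference boundaries that have at least one detected boundary within the tolerance. The function name `boundary_hit_rate` and the boundary times are hypothetical and are not taken from the paper.

```python
from bisect import bisect_left


def boundary_hit_rate(reference_ms, detected_ms, tolerance_ms):
    """Fraction of reference boundaries that have a detected boundary
    within +/- tolerance_ms. All times are integer milliseconds.

    Assumption: this is an illustrative scoring rule, not necessarily
    the one used in the paper.
    """
    if not reference_ms:
        raise ValueError("reference_ms must not be empty")
    detected_sorted = sorted(detected_ms)
    hits = 0
    for ref in reference_ms:
        # The nearest detected boundary is either just before or just
        # at/after the insertion point of ref in the sorted list.
        i = bisect_left(detected_sorted, ref)
        neighbours = detected_sorted[max(i - 1, 0):i + 1]
        if any(abs(d - ref) <= tolerance_ms for d in neighbours):
            hits += 1
    return hits / len(reference_ms)


# Hypothetical boundary times (ms) for one utterance.
reference = [100, 250, 400, 620]
detected = [104, 262, 399, 650]

for tol in (5, 10, 15):
    rate = boundary_hit_rate(reference, detected, tol)
    print(f"tolerance {tol} ms: {rate:.2%}")
```

In this simple version a single detected boundary may satisfy more than one reference boundary; a stricter evaluation would match boundaries one-to-one and also penalize spurious detections.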