International Conference on Signal Processing and Communication Systems

Detection of Vowel Offset Points Using Non-Local Similarity Between Speech Samples

Abstract

Automatic detection of vowels is both an important and a challenging problem. The vowel offset point (VEP) is the instant at which a vowel ends. VEPs are as important as vowel onset points (VOPs) for accurately marking vowels and analyzing the speech signal. The transition in signal magnitude at the VEPs differs markedly from that at the VOPs. Consequently, most of the front-end features proposed for detecting VOPs fail to detect VEPs. The performance of the existing features also degrades significantly for noisy speech signals. In this work, a robust front-end speech parametrization approach is proposed for enhancing the discrimination at the VEPs. In the proposed approach, a weight value is assigned to each sample point by computing the similarity between the samples belonging to two different frames within a search neighborhood. The weight values (WVs) computed from the non-local similarity (NLS) are significantly smaller when the frames under consideration are similar than when they are dissimilar. Because vowels are longer regions and exhibit periodicity, frames belonging to these regions are more similar to one another. In contrast, frames belonging to non-vowel regions and noise are dissimilar. In this work, the WVs computed from the NLS are used as a feature for detecting the VEPs in a given speech signal. The proposed method is observed to outperform a deep neural network-hidden Markov model based classifier under both clean and noisy test conditions, even after the inclusion of a recently proposed speech enhancement module.
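The abstract does not give the exact NLS formula, so the following Python sketch is only one plausible reading of the idea, not the authors' implementation. It assumes that the weight of a sample is the smallest normalized squared distance between the frame centered on that sample and any other frame within a search neighborhood. The function names (`nls_weights`, `offset_candidates`), the frame length, the lag range, the smoothing length, and the threshold are all hypothetical choices made for illustration.

```python
import numpy as np


def nls_weights(signal, frame_len=33, min_lag=20, max_lag=100):
    """Per-sample weight from non-local frame similarity (illustrative only).

    Assumed definition: for each sample, compare the frame centered on it with
    frames shifted by min_lag..max_lag samples in both directions, and keep the
    smallest normalized squared distance. Periodic regions find a near-identical
    frame one period away, so their weight is close to zero; noise does not.
    frame_len is expected to be odd so that frames are centered.
    """
    x = np.asarray(signal, dtype=float)
    half = frame_len // 2
    # Pad so that every sample has a full frame at every lag.
    padded = np.pad(x, half + max_lag, mode="reflect")
    frames = np.lib.stride_tricks.sliding_window_view(padded, frame_len)
    centers = np.arange(len(x)) + max_lag  # row of `frames` centered on sample n
    ref = frames[centers]
    ref_energy = np.sum(ref ** 2, axis=1)

    weights = np.full(len(x), np.inf)
    lags = [k for k in range(-max_lag, max_lag + 1) if abs(k) >= min_lag]
    for k in lags:
        other = frames[centers + k]
        num = np.sum((ref - other) ** 2, axis=1)
        den = ref_energy + np.sum(other ** 2, axis=1) + 1e-12
        weights = np.minimum(weights, num / den)
    return weights


def offset_candidates(weights, threshold=0.2, smooth_len=31):
    """Indices where the smoothed weight rises through the threshold.

    A low-to-high transition is read here as "similar frames end", which is
    an assumed stand-in for the detection stage; the abstract does not say
    how VEPs are picked from the weight contour.
    """
    kernel = np.ones(smooth_len) / smooth_len
    smoothed = np.convolve(weights, kernel, mode="same")
    above = smoothed >= threshold
    return np.flatnonzero(~above[:-1] & above[1:]) + 1


# Toy signal: 400 periodic ("vowel-like") samples followed by 400 noise samples.
rng = np.random.default_rng(0)
n = np.arange(400)
vowel_like = np.sin(2 * np.pi * n / 50)
noise_like = rng.standard_normal(400)
demo = np.concatenate([vowel_like, noise_like])

w = nls_weights(demo)
candidates = offset_candidates(w)
```

On this toy signal the weights stay near zero inside the periodic segment and rise once the frames start to include noise, so a rising crossing appears close to sample 400. Real speech would need a lag range matched to the expected pitch period, and silent frames would need separate handling because two all-zero frames are trivially similar under this distance.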