Speech analysis synthesis system and method thereof having the energy normalization and unvoiced frame suppression function

Abstract

Energy normalization in speech synthesis systems is achieved by a look-ahead adaptive normalization procedure, in which energy is adaptively tracked and the resulting energy-tracking value is used to normalize the energy value of a much earlier frame. In another aspect, silence suppression in speech synthesis systems is achieved by detecting and processing only segments of voice activity. A segment is classified as "speech" if the energy of the signal is greater than an adaptively adjusted threshold. This threshold is preferably defined as the maximum of scaled values of two separate envelope parameters, both of which track the variation in energy over the sequence of frames of speech data. One parameter is a slow-rising, fast-falling value that is updated only during unvoiced speech frames and therefore tracks a lower envelope of the energy contour; in effect, it tracks the ambient noise level. The other is a fast-rising, slow-falling value that is updated only during voiced speech frames and thus tracks an upper envelope of the energy contour; in effect, it tracks the average speech level.
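The two mechanisms in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract gives no formulas, so the one-pole envelope follower, the rise and fall rates, the scale factors, the look-ahead distance, and every name below (`EnvelopeTracker`, `classify_frames`, `normalize_with_lookahead`) are assumptions chosen for illustration. The per-frame voiced/unvoiced flags are also assumed to come from a separate voicing detector.

```python
from dataclasses import dataclass


@dataclass
class EnvelopeTracker:
    """One-pole envelope follower with separate rise and fall rates (assumed form)."""
    rise: float          # fraction of the gap closed when energy is above the envelope
    fall: float          # fraction of the gap closed when energy is at or below it
    value: float = 0.0

    def update(self, energy: float) -> float:
        rate = self.rise if energy > self.value else self.fall
        self.value += rate * (energy - self.value)
        return self.value


def classify_frames(energies, voiced_flags, noise_scale=4.0, speech_scale=0.05):
    """Return one boolean per frame: True if the frame is classified as speech.

    The scale factors and tracker rates are illustrative guesses, not values
    taken from the patent.
    """
    energies = list(energies)
    if not energies:
        return []
    # Lower envelope: slow-rising, fast-falling, updated on unvoiced frames only.
    noise = EnvelopeTracker(rise=0.02, fall=0.5, value=energies[0])
    # Upper envelope: fast-rising, slow-falling, updated on voiced frames only.
    speech = EnvelopeTracker(rise=0.5, fall=0.02, value=energies[0])
    decisions = []
    for energy, voiced in zip(energies, voiced_flags):
        if voiced:
            speech.update(energy)
        else:
            noise.update(energy)
        # Threshold is the maximum of the two scaled envelope values.
        threshold = max(noise_scale * noise.value, speech_scale * speech.value)
        decisions.append(energy > threshold)
    return decisions


def normalize_with_lookahead(energies, lookahead=3, rise=0.5, fall=0.02):
    """Normalize frame n by the tracked energy at frame n + lookahead.

    Near the end of the sequence the last available tracked value is used.
    """
    energies = list(energies)
    if not energies:
        return []
    tracker = EnvelopeTracker(rise=rise, fall=fall, value=energies[0])
    tracked = [tracker.update(e) for e in energies]
    last = len(energies) - 1
    return [
        e / max(tracked[min(n + lookahead, last)], 1e-12)
        for n, e in enumerate(energies)
    ]
```

Calling `classify_frames` on a quiet-loud-quiet energy sequence marks only the loud frames as speech, while `normalize_with_lookahead` divides each frame's energy by a tracker value computed a few frames later, so an early quiet frame is already scaled against the louder speech that follows it.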