A formant-based linear prediction speech synthesis/analysis system.


Abstract

The aim of this research was to develop a speech synthesis/analysis system as the framework for generating high-fidelity synthetic speech and for psychoacoustic studies. A formant-based linear prediction (LP) synthesizer, along with a robust speech analysis procedure, was developed to achieve this aim. The major feature of this system is its ability to adapt the formant and linear prediction schemes to represent the voiced and unvoiced sounds, respectively. The advantages of employing two kinds of schemes in one synthesis system are that (1) the formant scheme is physically meaningful for simulating the human speech production system, and (2) the LP scheme is able to reproduce the spectrum of all speech sounds.

The formant-based LP synthesizer uses two types of sources, voiced and unvoiced, to form the excitation part of the synthesizer. These sources are either nonparametric waveforms or parametric models of waveforms. The vocal tract is characterized by a twelfth-order linear prediction filter. For voiced sounds, the coefficients of the vocal tract filter are determined by the first six formants. The counterparts for unvoiced sounds are obtained by means of a twelfth-order LP analysis. This synthesizer can resynthesize speech almost perfectly when the estimated glottal waveform from a glottal inverse filtering process is used as the excitation source. When the modeled waveform is used as the excitation source, the synthesized speech is natural and intelligible.

The other feature of this research is that the interaction between the synthesis and analysis is closely defined. A two-phase, LP-based procedure that analyzes a segment of the speech signal was developed to estimate the time-varying synthesis parameters for the formant-based LP synthesizer, such as the voiced/unvoiced classification, fundamental frequency, signal power, formants (for voiced sounds), LP coefficients (for unvoiced sounds), and the estimated glottal waveforms.

Based on the synthesis and analysis procedures, as well as a knowledge of the relationships between vocal quality and glottal features, a voice conversion procedure that reproduces the vocal tract component, but varies the glottal features, was developed to convert a segment of the speech signal of modal voice type to five other voice types (vocal fry, breathy, falsetto, whisper, and harsh). The conversion procedure provides a systematic method for examining the relationships between vocal quality and glottal features, and can be used to build a database for various voice types, which can be used in training a speech recognition system.

In addition to the glottal source parameters, the vocal tract parameters can be manipulated by our synthesis/analysis system as well. Since the features of the glottal source and the vocal tract are both involved in speech studies such as gender conversion, speaker identification, and speech recognition, this synthesis/analysis system can serve as a tool for future applications.
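
The abstract states that, for voiced sounds, the coefficients of the twelfth-order vocal tract filter are determined by the first six formants, but it does not say how. One common textbook mapping, shown here only as a sketch and not necessarily the procedure used in the dissertation, treats each formant as a complex-conjugate pole pair whose angle comes from the formant frequency and whose radius comes from the formant bandwidth. Six such second-order sections multiply out to a twelfth-order all-pole polynomial. The function names, the sampling rate, and the formant and bandwidth values below are hypothetical illustration values, not data from the source.

```python
import numpy as np

def formants_to_lp(formants_hz, bandwidths_hz, fs):
    """Build the denominator A(z) of an all-pole filter 1/A(z).

    Each (frequency, bandwidth) pair contributes one conjugate pole pair,
    so K formants yield a polynomial of order 2K.
    """
    a = np.array([1.0])
    for f, b in zip(formants_hz, bandwidths_hz):
        r = np.exp(-np.pi * b / fs)       # pole radius from the bandwidth
        theta = 2.0 * np.pi * f / fs      # pole angle from the center frequency
        section = np.array([1.0, -2.0 * r * np.cos(theta), r * r])
        a = np.convolve(a, section)       # polynomial multiplication
    return a

def lp_to_formants(a, fs):
    """Invert the mapping: read frequencies and bandwidths off the roots of A(z)."""
    roots = np.roots(a)
    roots = roots[np.imag(roots) > 0]     # keep one root of each conjugate pair
    freqs = np.angle(roots) * fs / (2.0 * np.pi)
    bws = -np.log(np.abs(roots)) * fs / np.pi
    order = np.argsort(freqs)
    return freqs[order], bws[order]

# Hypothetical values chosen for illustration only (assumed, not from the source).
fs = 10000.0
formants = [500.0, 1500.0, 2500.0, 3500.0, 4200.0, 4700.0]
bandwidths = [60.0, 90.0, 120.0, 150.0, 200.0, 250.0]

a = formants_to_lp(formants, bandwidths, fs)
print("filter order:", len(a) - 1)
```

Because every pole radius is below one for a positive bandwidth, the resulting filter is stable by construction. The inverse helper `lp_to_formants` illustrates why such a representation is convenient for analysis and resynthesis: the same twelve coefficients can be read either as LP coefficients or as six formant/bandwidth pairs.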

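Bibliographic record

  • Author: Shue, Yean-Jen
  • Author affiliation: University of Florida
  • Degree-granting institution: University of Florida
  • Subject: Engineering, Electronics and Electrical
  • Degree: Ph.D.
  • Year: 1995
  • Pages: 163 p.
  • Total pages: 163
  • Original format: PDF
  • Language of text: English (eng)
  • Chinese Library Classification: Radio electronics, telecommunications technology
  • Date indexed: 2022-08-17 11:49:39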
