Journal of Voice: official journal of the Voice Foundation

Bio-inspired evolutionary oral tract shape modeling for physical modeling vocal synthesis.


Abstract

Physical modeling with digital waveguide mesh (DWM) models is an audio synthesis method whose output, in music synthesis applications, is often described as "organic," "warm," or "intimate." This paper describes work that takes its inspiration from physical modeling music synthesis and applies it to speech synthesis through a physical modeling mesh model of the human oral tract. Oral tract shapes are found using a computational technique based on the principles of biological evolution. Successful speech synthesis using this method depends on accurate measurements of the cross-sectional area of the human oral tract, which are usually derived from magnetic resonance imaging (MRI). However, such images are nonideal because of the lengthy exposure time required (relative to the time of articulation of speech sounds), the local ambient acoustic noise associated with the MRI machine itself, and the supine position required of the subject. An alternative method is described in which a bio-inspired computing technique that simulates the process of evolution is used to evolve oral tract shapes. Using acoustic and excitation data from two adult males and two adult females, this technique produces appropriate oral tract shapes for open vowels, but the shapes it produces for close vowels are less appropriate. The technique has none of the drawbacks associated with MRI, because all it requires from the subject is an acoustic and electrolaryngograph (or electroglottograph) recording. Appropriate oral tract shapes do enable the model to produce excellent quality synthetic speech for vowel sounds, and sounds that involve dynamic oral tract shape changes, such as diphthongs, can also be synthesized using an impedance-mapped technique. Efforts to improve performance for close vowels by reducing mesh quantization had little effect, and further work is required.
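
As a rough illustration of the idea of evolving tract shapes against an acoustic target, the sketch below runs a small genetic algorithm over a vector of cross-sectional areas. It is not the paper's method. It swaps the digital waveguide mesh for a lossless one-dimensional tube chain with an ideal open end at the lips, and it swaps the recorded speech and electrolaryngograph data for a made-up target shape. The names `denominator`, `error`, and `evolve`, the constants, the area range, and all GA settings are assumptions chosen for the example.

```python
import math
import random

C_SOUND = 350.0      # speed of sound in m/s (assumed)
TRACT_LEN = 0.175    # tract length in m (assumed)
FREQS = [100.0 * i for i in range(1, 41)]  # 100 Hz to 4 kHz comparison grid (assumed)


def denominator(areas, freq):
    """Return D(f) for a lossless tube chain; |H(f)| = 1/|D(f)|, resonances at D = 0.

    `areas` runs from glottis to lips; the lips are treated as an ideal open end.
    """
    seg = TRACT_LEN / len(areas)
    k = 2.0 * math.pi * freq / C_SOUND
    cs, sn = math.cos(k * seg), math.sin(k * seg)
    c, d = 0.0, 1.0  # bottom row of the chain matrix, starting from identity
    for a in areas:
        c, d = c * cs + d * sn * a, d * cs - c * sn / a
    return d


def error(areas, target_curve):
    """Sum of squared differences between a candidate's D(f) and the target's."""
    return sum((denominator(areas, f) - t) ** 2 for f, t in zip(FREQS, target_curve))


def evolve(target_curve, n_sections=8, pop_size=30, generations=40, seed=0):
    """Evolve an area vector whose D(f) approaches target_curve.

    Returns (best_areas, history), where history holds the best error per generation.
    """
    rng = random.Random(seed)
    lo, hi = 0.2, 8.0  # allowed area range in arbitrary units (assumed)
    pop = [[rng.uniform(lo, hi) for _ in range(n_sections)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=lambda ind: error(ind, target_curve))
        history.append(error(pop[0], target_curve))
        parents = pop[: pop_size // 2]   # truncation selection
        children = [pop[0][:]]           # elitism: carry the best over unchanged
        while len(children) < pop_size:
            p1, p2 = rng.sample(parents, 2)
            child = [rng.choice(pair) for pair in zip(p1, p2)]  # uniform crossover
            # Gaussian mutation on roughly 20% of genes, clipped to the area range
            child = [min(hi, max(lo, a + rng.gauss(0.0, 0.3))) if rng.random() < 0.2 else a
                     for a in child]
            children.append(child)
        pop = children
    pop.sort(key=lambda ind: error(ind, target_curve))
    history.append(error(pop[0], target_curve))
    return pop[0], history


# Made-up two-tube target shape (narrow back, wide front); not measured data.
target_areas = [1.0, 1.0, 1.0, 1.0, 4.0, 4.0, 4.0, 4.0]
target_curve = [denominator(target_areas, f) for f in FREQS]
best, history = evolve(target_curve)
print("first and last best error:", history[0], history[-1])
```

In this simplified tube model, D(f) is unchanged when every area is scaled by the same factor, so the search can only recover a shape up to an overall scale. A real system would compare synthesized and recorded spectra and would need a far richer acoustic model than this one.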
