Conference: International Symposium on Chinese Spoken Language Processing

Reconstruction of pitch for whisper-to-speech conversion of Chinese

Abstract

Whispers are a common and necessary secondary vocal communications mechanism in natural human-to-human dialogue. They are also the primary communications mechanism for many people suffering from aphonia, such as laryngectomees. For typical speakers, whispering is a predominantly contextual activity, prompted either by the sensitive nature of the information being conveyed or by environmental considerations. Given the importance of whispers, especially for tonal languages like Chinese, and the fact that many communications systems assume vocalised speech, much work has been directed towards the conversion of whispers into natural-sounding speech. Since pitch information is largely absent in whispers, it is this key f0 information that needs to be supplied during the regeneration process, and it is the focus of much research. GMM-based reconstruction techniques have proven effective at whisper reconstruction, and some recent work has proposed the use of artificial pitch derived from formant harmonics as an alternative. This paper describes a new formulation of the formant-harmonic f0 method and compares it directly against a novel GMM-based f0 estimator, as well as against known correct pitch excitation for parallel utterances.
