Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing

Multi-Frame Amplitude Envelope Estimation for Modification of Singing Voice



Abstract

Singing voice synthesis benefits from very high-quality estimation of the resonances and anti-resonances of the vocal tract filter (VTF), i.e., the amplitude spectral envelope. In the state of the art, a single DFT frame is commonly used as the basis for building spectral envelopes. Although multiple frame analysis (MFA) has already been suggested for envelope estimation, it is not yet used in concrete applications. Indeed, even though existing attempts have shown very interesting results, we demonstrate that they are either overly complicated or fail to reach the high accuracy that singing voice requires. To enable future applications of MFA, this article aims to improve the theoretical understanding of MFA-based methods and to clarify their advantages. Singing voice signals are well suited to studying MFA methods because the VTF configuration can remain relatively stable while the vibrato creates a regular variation that is easy to model. By simplifying and extending previous work, we also propose and describe two MFA-based methods. To better understand the behavior of the envelope estimates, we designed numerical measurements that assess single frame analysis and MFA methods on synthetic signals. We also designed two proofs of concept, evaluated with listening tests, based on pitch scaling and timbre conversion. Both evaluations show clear and positive results for MFA-based methods, thus encouraging this research direction for future applications.
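
The intuition behind MFA can be illustrated with a small numerical sketch. It is not either of the two methods described in the article; it only shows why vibrato helps. In a single frame, the harmonics sample the envelope only at multiples of the fundamental frequency, whereas frames taken across a vibrato cycle sample it at shifted positions, so pooling them gives a denser set of points. The envelope shape (`toy_envelope_db`), the 220 Hz fundamental, the vibrato depth of 100 cents, the 8 frames, and the use of linear interpolation are all assumptions made for this illustration.

```python
import numpy as np

def toy_envelope_db(f):
    """Hypothetical smooth envelope in dB with two formant-like bumps."""
    f = np.asarray(f, dtype=float)
    return (20.0 * np.exp(-((f - 700.0) / 150.0) ** 2)
            + 15.0 * np.exp(-((f - 2500.0) / 300.0) ** 2) - 30.0)

def harmonic_samples(f0_values, fmax):
    """Pool the harmonic frequencies of every f0 value and sample the envelope there."""
    freqs = []
    for f0 in np.atleast_1d(f0_values):
        n_harm = int(fmax // f0)              # harmonics at or below fmax
        freqs.append(f0 * np.arange(1, n_harm + 1))
    # Sort and drop (near-)duplicate frequencies so interpolation is well defined.
    freqs = np.unique(np.round(np.concatenate(freqs), 6))
    return freqs, toy_envelope_db(freqs)

def interpolation_rms_error(freqs, amps_db, grid):
    """RMS error (dB) of a linear interpolation through the sampled points."""
    estimate = np.interp(grid, freqs, amps_db)
    return float(np.sqrt(np.mean((estimate - toy_envelope_db(grid)) ** 2)))

# Illustrative settings (assumptions, not values from the article).
F0_MEAN = 220.0              # mean fundamental frequency in Hz
VIBRATO_DEPTH_CENTS = 100.0  # peak vibrato deviation
FMAX = 4000.0                # highest frequency considered in Hz
N_FRAMES = 8                 # frames spread over one vibrato cycle

# Sinusoidal vibrato: f0 of each frame, the first frame sits at the mean f0.
phases = 2.0 * np.pi * np.arange(N_FRAMES) / N_FRAMES
f0_track = F0_MEAN * 2.0 ** (VIBRATO_DEPTH_CENTS / 1200.0 * np.sin(phases))

single_f, single_a = harmonic_samples(f0_track[0], FMAX)  # one frame only
multi_f, multi_a = harmonic_samples(f0_track, FMAX)       # all frames pooled

# Compare both estimates on the range covered by the single-frame harmonics.
grid = np.linspace(single_f[0], single_f[-1], 2000)
err_single = interpolation_rms_error(single_f, single_a, grid)
err_multi = interpolation_rms_error(multi_f, multi_a, grid)

print(f"single frame: {len(single_f)} samples, RMS error {err_single:.2f} dB")
print(f"multi frame:  {len(multi_f)} samples, RMS error {err_multi:.2f} dB")
```

A real signal would add windowing effects, noise, and a slowly varying VTF, which this sketch leaves out on purpose.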
