Conference: IEEE International Conference on Acoustics, Speech and Signal Processing

Context-Aware Prosody Correction for Text-Based Speech Editing

Abstract

Text-based speech editors speed up the editing of speech recordings by letting users apply intuitive cut, copy, and paste operations to a speech transcript. A major drawback of current systems, however, is that edited recordings often sound unnatural because of prosody mismatches around the edited regions. We propose a new context-aware method for more natural-sounding text-based speech editing. To do so, we 1) use a series of neural networks to generate salient prosody features that depend on the prosody of the speech surrounding the edit and are amenable to fine-grained user control; 2) use the generated features to control a standard pitch-shift and time-stretch method; and 3) apply a denoising neural network to remove artifacts induced by the signal manipulation, yielding a high-fidelity result. We evaluate our approach with a subjective listening test, provide a detailed comparative analysis, and draw several insights from the results.
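
As a rough illustration of step 2, the sketch below shows one way predicted prosody features could be turned into control parameters for a generic pitch-shift and time-stretch routine. It is not the authors' implementation: the function names, the use of a single mean F0 value per segment, and the example numbers are all assumptions made for illustration, and the abstract does not say which signal-processing method is used.

```python
import math


def pitch_shift_semitones(source_f0_hz: float, target_f0_hz: float) -> float:
    """Semitone offset that moves source_f0_hz to target_f0_hz (hypothetical helper)."""
    if source_f0_hz <= 0 or target_f0_hz <= 0:
        raise ValueError("F0 values must be positive (voiced speech only)")
    # One octave (a doubling of frequency) corresponds to 12 semitones.
    return 12.0 * math.log2(target_f0_hz / source_f0_hz)


def time_stretch_rate(source_duration_s: float, target_duration_s: float) -> float:
    """Playback-rate factor: above 1 shortens the segment, below 1 lengthens it."""
    if source_duration_s <= 0 or target_duration_s <= 0:
        raise ValueError("durations must be positive")
    return source_duration_s / target_duration_s


# Assumed example values: an inserted word recorded at 110 Hz and 0.30 s,
# with context-dependent targets of 220 Hz and 0.45 s.
shift = pitch_shift_semitones(110.0, 220.0)
rate = time_stretch_rate(0.30, 0.45)
```

The two numbers would then be passed to whichever pitch-shift and time-stretch routine is in use; a real system would more likely work with frame-level pitch contours and per-phoneme durations than with one value per segment.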