Context-Aware Prosody Correction for Text-Based Speech Editing


Abstract

Text-based speech editors expedite the editing of speech recordings by permitting intuitive cut, copy, and paste operations on a speech transcript. A major drawback of current systems, however, is that edited recordings often sound unnatural because of prosody mismatches around edited regions. In our work, we propose a new context-aware method for more natural-sounding text-based editing of speech. To do so, we 1) use a series of neural networks to generate salient prosody features that are dependent on the prosody of speech surrounding the edit and amenable to fine-grained user control; 2) use the generated features to control a standard pitch-shift and time-stretch method; and 3) apply a denoising neural network to remove artifacts induced by the signal manipulation to yield a high-fidelity result. We evaluate our approach using a subjective listening test, provide a detailed comparative analysis, and draw several interesting insights.
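As a rough illustration of step 2, the sketch below shows one way generated prosody features could be turned into control values for a pitch-shift and time-stretch routine. It is a minimal sketch, not the authors' implementation: the function name `prosody_controls`, its parameters, and the reduction of each region to a single pitch and duration value are all assumptions made for illustration.

```python
import math


def prosody_controls(src_f0_hz, tgt_f0_hz, src_dur_s, tgt_dur_s):
    """Hypothetical helper: map source and target prosody to DSP controls.

    Returns (pitch shift in semitones, time-stretch factor), where a
    factor above 1 means the edited region should be lengthened.
    """
    if min(src_f0_hz, tgt_f0_hz, src_dur_s, tgt_dur_s) <= 0:
        raise ValueError("pitch and duration must be positive")
    # One octave (a doubling of frequency) corresponds to 12 semitones.
    semitones = 12.0 * math.log2(tgt_f0_hz / src_f0_hz)
    # Ratio of the desired duration to the recorded duration.
    stretch = tgt_dur_s / src_dur_s
    return semitones, stretch
```

For example, `prosody_controls(110.0, 220.0, 0.25, 0.5)` returns `(12.0, 2.0)`: raise the pitch by one octave and double the duration. A practical system would work on frame-level pitch contours and per-phoneme durations rather than single values.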


Bibliographic details

Source
IEEE International Conference on Acoustics, Speech and Signal Processing, 2021, pp. 7038-7042 (5 pages)
Authors
Max Morrison; Lucas Rencker; Zeyu Jin; Nicholas J. Bryan; Juan-Pablo Caceres; Bryan Pardo
Original format: PDF
Keywords
Vocoders; Conferences; Neural networks; Noise reduction; Production; Signal processing; Real-time systems


Similar literature


1. Arabic discourse analysis based on acoustic, prosodic and phonetic modeling: elocution evaluation, speech classification and pathological speech correction [J]. Mohsen Maraoui, Naim Terbeh, Mounir Zrigui. International Journal of Speech Technology. 2018, No. 4.

2. Prosody Correction Preserving Speaker Individuality for Chinese-Accented Japanese HMM-Based Text-to-Speech Synthesis [J]. Daiki SEKIZAWA, Shinnosuke TAKAMICHI, Hiroshi SARUWATARI. IEICE Transactions on Information and Systems. 2019, No. 6.

3. Non-Native Text-to-Speech Preserving Speaker Individuality Based on Partial Correction of Prosodic and Phonetic Characteristics [J]. Yuji OSHIMA, Shinnosuke TAKAMICHI, Tomoki TODA. IEICE Transactions on Information and Systems. 2016, No. 12.

4. Prosodic Word-Based Error Correction in Speech Recognition Using Prosodic Word Expansion and Contextual Information [C]. Chao-Hong Liu, Chung-Hsien Wu. Annual Conference of the International Speech Communication Association; INTERSPEECH 2010. 2011.

5. Speech Synthesis for Text-Based Editing of Audio Narration [D]. Jin, Zeyu. 2018.

6. The Prosodic Marionette: a method to visualize speech prosody and assess perceptual and expressive prosodic abilities [O]. Jonathan S. Brumberg, Jill C. Thorson, Rupal Patel.

7. The Japanese speech synthesis system with text editing and automatic prosodic control facilities [O]. Seiichi Yamamoto, Norio Higuchi, Tohru Shimizu. 1988.
