Fastpitch: Parallel Text-to-Speech with Pitch Prediction

Abstract

We present FastPitch, a fully-parallel text-to-speech model based on FastSpeech, conditioned on fundamental frequency contours. The model predicts pitch contours during inference. By altering these predictions, the generated speech can be made more expressive, better matched to the semantics of the utterance, and ultimately more engaging to the listener. Uniformly increasing or decreasing pitch with FastPitch generates speech that resembles the voluntary modulation of voice. Conditioning on frequency contours improves the overall quality of synthesized speech, making it comparable to the state of the art. It introduces no overhead, and FastPitch retains the favorable, fully-parallel Transformer architecture, with a real-time factor of over 900× for mel-spectrogram synthesis of a typical utterance.
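
Two ideas in the abstract can be illustrated with a minimal sketch: uniformly shifting a predicted pitch contour, and reading the real-time factor. This is not the paper's implementation. The function names, the semitone parameterization, the use of 0.0 to mark unvoiced frames, and the reading of the real-time factor as seconds of audio produced per second of compute are all assumptions made for illustration.

```python
from typing import List


def shift_pitch(f0_hz: List[float], semitones: float) -> List[float]:
    """Uniformly shift a pitch contour by a number of semitones.

    Assumption: f0_hz holds one fundamental-frequency value per frame,
    and 0.0 marks an unvoiced frame, which is left unchanged.
    """
    factor = 2.0 ** (semitones / 12.0)  # 12 semitones doubles the frequency
    return [f * factor if f > 0.0 else 0.0 for f in f0_hz]


def real_time_factor(audio_seconds: float, synthesis_seconds: float) -> float:
    """Seconds of audio generated per second of compute (assumed convention)."""
    return audio_seconds / synthesis_seconds


# Hypothetical predicted contour: two voiced frames around one unvoiced frame.
predicted = [100.0, 0.0, 200.0]
print(shift_pitch(predicted, 12.0))  # one octave up -> [200.0, 0.0, 400.0]
```

Under this reading, a real-time factor above 900 would mean that a 9-second utterance's mel-spectrogram is produced in under 10 milliseconds.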

Bibliographic record

Source
IEEE International Conference on Acoustics, Speech and Signal Processing | 2021 | pp. 6588-6592 | 5 pages
Author
Adrian Łańcucki
Original format: PDF
Keywords
Frequency synthesizers; Frequency modulation; Conferences; Semantics; Predictive models; Real-time systems; Acoustics
