Singing voice synthesis in Spanish by concatenation of syllables based on the TD-PSOLA algorithm

Abstract

This work presents the development of a Spanish singing voice synthesizer based on the TD-PSOLA algorithm. The main goal was to test the hypothesis that, while diphones are linguistically the units offering the best compromise between intelligibility and flexibility for spoken voice synthesis, syllables are the units best suited to concatenative singing voice synthesis. The hypothesis is particularly strong for Spanish, since its rules for syllable construction are comprehensive, relatively simple, and few in number. To test it, a relatively small set of Spanish vowels and syllables was recorded by a soprano singer at both F4 and C5, each lasting 1 second (±0.2 s). The syllables were modified only in pitch and duration. Matlab was used as the programming platform, mainly because of the author's relative expertise with it. To evaluate the system, it was given several melodic tasks, including singing the popular Mexican song "Las Mañanitas". Results show that a highly intelligible synthesized Spanish singing voice based on syllable concatenation can be achieved with minimal control mechanisms. Duration changes introduced very few noticeable digital artifacts, and transposition of up to a perfect fourth was possible without generating very obvious artifacts. A variation of about 5% (0.05) in frequency corresponds roughly to one semitone in the modern equal-tempered scale (the exact semitone ratio is 2^(1/12) ≈ 1.0595).
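The pitch and duration modification described in the abstract can be illustrated with a minimal TD-PSOLA sketch. This is not the author's Matlab implementation, which is not shown in this record. It is a simplified Python version that assumes the pitch period is constant and already known, so no pitch-mark detection is performed. The function name `td_psola` and its parameters `period`, `pitch_ratio`, and `time_ratio` are hypothetical names chosen for illustration.

```python
import numpy as np

def td_psola(x, period, pitch_ratio=1.0, time_ratio=1.0):
    """Simplified TD-PSOLA for a voiced signal with a constant, known pitch period.

    x           : 1-D array of samples
    period      : pitch period of x in samples (assumed constant and known)
    pitch_ratio : output f0 / input f0 (2 ** (n / 12) for n semitones)
    time_ratio  : output duration / input duration
    """
    x = np.asarray(x, dtype=float)
    period = int(period)

    # Analysis pitch marks: one per period, placed so a full grain fits inside x.
    marks = np.arange(period, len(x) - period, period)
    if len(marks) == 0:
        raise ValueError("signal must be longer than two pitch periods")

    # Each grain spans two pitch periods and is tapered with a Hann window.
    window = np.hanning(2 * period + 1)

    # Raising the pitch means placing grains closer together in the output.
    new_period = max(1, int(round(period / pitch_ratio)))
    out_len = int(round(len(x) * time_ratio))
    y = np.zeros(out_len)

    t = period  # first synthesis mark
    while t + period < out_len:
        # Map the synthesis time back to the input time axis and pick the
        # nearest analysis mark; grains are repeated or skipped as needed.
        src = t / time_ratio
        m = marks[np.argmin(np.abs(marks - src))]
        y[t - period:t + period + 1] += x[m - period:m + period + 1] * window
        t += new_period

    # Note: no amplitude normalization is applied in this sketch, so the
    # output level changes with the amount of grain overlap.
    return y
```

For example, transposing up a perfect fourth (five semitones) corresponds to `pitch_ratio = 2 ** (5 / 12)`, about 1.335, while `time_ratio = 1.5` lengthens a recorded syllable by half without changing its pitch. A real system would also need pitch-mark estimation for each recorded syllable and some form of gain compensation.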

Bibliographic details

Source
WSEAS International Conference on Acoustics Music: Theory Applications, 2012, 6 pages
Author
ALEJANDRO RAMOS-AMEZQUITA
Original format
PDF
Chinese Library Classification
TP39-53
Keywords
Singing voice; Synthesis; Concatenation; Syllables; Spanish; Time Domain; PSOLA

Similar literature

1. HMM-based expressive singing voice synthesis with singing style control and robust pitch modeling [J]. Takashi Nose, Misa Kanemoto, Tomoki Koriyama. Computer Speech and Language, 2015, No. 1.
2. Synthesis of Spontaneous Speech With Syllable Contraction Using State-Based Context-Dependent Voice Transformation [J]. Wu C.-H., Huang Y.-C., Lee C.-H. IEEE Transactions on Audio, Speech, and Language Processing, 2014, No. 3.
3. A HMM-based mandarin chinese singing voice synthesis system [J]. X. Li, Z. Wang. IEEE/CAA Journal of Automatica Sinica, 2016, No. 2.
4. Singing voice synthesis in Spanish by concatenation of syllables based on the TD-PSOLA algorithm [C]. ALEJANDRO RAMOS-AMEZQUITA. Latest advances in acoustics and music, 2012.
5. Acoustic models for the analysis and synthesis of the singing voice [D]. Lee, Matthew E. 2005.
6. Effects of age sex and syllable structure on voice onset time: Evidence from children's voiceless aspirated stops [O]. Vickie Y. Yu, Luc F. De Nil, Elizabeth W. Pang.
7. Generative Moment Matching Network-based Random Modulation Post-filter for DNN-based Singing Voice Synthesis and Neural Double-tracking [O]. Hiroki Tamaru, Yuki Saito, Shinnosuke Takamichi. 2019.
