
Integrating a Voice Analysis-Synthesis System with a TTS Framework for Controlling Affect and Speaker Identity


Abstract

This paper reports an experiment exploring how a voice analysis-synthesis system, GlórCáil, can be used to add expressiveness to the synthetic voice in text-to-speech (TTS) systems. The implementation focuses on the Irish ABAIR TTS voices, where such voice control would facilitate many current and envisaged applications. GlórCáil allows voice control of synthesized speech, and for this experiment it was integrated into a DNN-based TTS framework. Utterances were generated with f0, voice quality and vocal tract parameter manipulations targeting shifts in speaker identity and in the affective coloring of utterances. The scaling factors used for the manipulations were suggested in an earlier study. They involved global changes without sentence-internal dynamic variation, the aim being to ascertain whether such global shifts might alter listeners' perception of speaker identity and affect. Results demonstrate affect shifts compatible with expectations. However, there were confounding factors. The female and child voices were poorly differentiated, which was expected given the similarity in the scaling factors used. The affect transformations suggest that the baseline voice had an intrinsically sad quality, so that there is weak differentiation between the sad and no-emotion stimuli. The male angry voice was the least successful, suggesting that dynamic, within-utterance variation is essential for the signaling of certain affects.
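The idea of a global manipulation "without sentence-internal dynamic variation" can be illustrated with a minimal sketch. It is not GlórCáil's interface or the authors' implementation: the `Frame` type, the `apply_global_scaling` function and any scale values passed to it are hypothetical, and voice quality parameters are left out for brevity. The sketch only shows that one constant factor per parameter is applied to every frame, so the shape of the f0 contour within the utterance is unchanged.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Frame:
    """One analysis frame of an utterance (hypothetical representation)."""
    f0_hz: float               # fundamental frequency; 0.0 marks an unvoiced frame
    formants_hz: List[float]   # vocal tract resonances for this frame


def apply_global_scaling(frames: List[Frame],
                         f0_scale: float,
                         formant_scale: float) -> List[Frame]:
    """Apply the same constant factors to every frame of the utterance.

    Because the factors do not vary over time, the relative shape of the
    f0 contour is preserved; only its overall level changes.
    """
    scaled = []
    for frame in frames:
        # Leave unvoiced frames (f0 == 0) unvoiced.
        new_f0 = frame.f0_hz * f0_scale if frame.f0_hz > 0 else 0.0
        new_formants = [f * formant_scale for f in frame.formants_hz]
        scaled.append(Frame(new_f0, new_formants))
    return scaled
```

A dynamic manipulation, by contrast, would let the factors depend on the frame position, which is the kind of within-utterance variation the abstract suggests may be needed for affects such as anger.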


Bibliographic details

Source: Irish Signals and Systems Conference, 2021, pp. 1-6 (6 pages)
Authors: Andy Murphy; Irena Yanushevskaya; Ailbhe Ní Chasaide; Christer Gobl
Original format: PDF
Keywords: Modulation

