A VoiceFont Creation Framework for Generating Personalized Voices

Takashi SAITO; Masaharu SAKAMOTO

Abstract

This paper presents a new framework for effectively creating VoiceFonts for speech synthesis. A VoiceFont in this paper represents a voice inventory aimed at generating personalized voices. Creating well-formed voice inventories is a time-consuming and laborious task. This has become a critical issue for speech synthesis systems that attempt to synthesize many high-quality voice personalities. The framework we propose here aims to drastically reduce this burden with a twofold approach. First, in order to substantially enhance the accuracy and robustness of automatic speech segmentation, we introduce a multi-layered speech segmentation algorithm with a new measure of segmental reliability. Second, to minimize the amount of human intervention in the process of VoiceFont creation, we provide easy-to-use functions in a data viewer and compiler to facilitate checking and validation of the automatically extracted data. We conducted experiments to investigate the accuracy of the automatic speech segmentation and its robustness to speaker and style variations. The results of the experiments on six speech corpora with a fairly large variation of speaking styles show that the speech segmentation algorithm is quite accurate and robust in extracting segments of both phonemes and accentual phrases. In addition, to subjectively evaluate VoiceFonts created by using the framework, we conducted a listening test for speaker recognizability. The results show that the voice personalities of synthesized speech generated by the VoiceFont-based speech synthesizer are fairly close to those of the donor speakers.

Bibliographic record

Source
IEICE Transactions on Information and Systems | 2005, No. 3 | pp. 525-534 | 10 pages
Authors
Takashi SAITO; Masaharu SAKAMOTO;
Author affiliation

IBM Research, Tokyo Research Laboratory, IBM Japan Ltd., Yamato-shi, 242-8502 Japan;

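Indexed in: Science Citation Index (SCI, USA); Engineering Index (EI, USA)
Original format: PDF
Language: English
Chinese Library Classification: Radio electronics, telecommunications technology
Keywords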
personalized voice; voice font; voice inventory generation; automatic segmentation; corpus-based speech synthesis; speaker recognizability;
