On-line experimental methods to evaluate text-to-speech (TTS) synthesis: effects of voice gender and signal quality on intelligibility, naturalness and preference


Abstract

Three experiments are reported that use new experimental methods for the evaluation of text-to-speech (TTS) synthesis from the user's perspective. Experiment 1, using sentence stimuli, and Experiment 2, using discrete "call centre" word stimuli, investigated the effect of voice gender and signal quality on the intelligibility of three concatenative TTS synthesis systems. Accuracy and search time were recorded as on-line, implicit indices of intelligibility during phoneme detection tasks. It was found that both voice gender and noise affect intelligibility. Results also indicate interactions of voice gender, signal quality, and TTS synthesis system on accuracy and search time. In Experiment 3 the method of paired comparisons was used to yield ranks of naturalness and preference. As hypothesized, preference and naturalness ranks were influenced by TTS system, signal quality and voice, in isolation and in combination. The pattern of results across the four dependent variables (accuracy, search time, naturalness, preference) was consistent. Natural speech surpassed synthetic speech, and TTS system C elicited relatively high scores across all measures. Intelligibility, judged naturalness and preference are modulated by several factors and there is a need to tailor systems to particular commercial applications and environmental conditions.
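The abstract says that Experiment 3 used the method of paired comparisons to yield ranks, but it does not describe the scoring procedure. As a rough illustration only, the sketch below shows one common way such ranks can be derived: every item is compared with every other item once, and items are ordered by how often they were chosen. The stimulus labels, the `hypothetical_scores` table, and the function names are all assumptions made for this example; none of them come from the paper, and the numbers are not experimental data.

```python
from collections import Counter
from itertools import combinations


def rank_by_paired_comparisons(items, choose):
    """Order items by how many pairwise comparisons each one wins.

    `choose(a, b)` must return whichever of the two items was preferred.
    """
    wins = Counter({item: 0 for item in items})
    # Present every unordered pair exactly once and credit the chosen item.
    for a, b in combinations(items, 2):
        wins[choose(a, b)] += 1
    # Most wins first; sorted() is stable, so ties keep their input order.
    ranking = sorted(items, key=lambda item: -wins[item])
    return ranking, dict(wins)


# Made-up preference strengths for four hypothetical stimuli (not real data).
hypothetical_scores = {"stim_w": 1, "stim_x": 3, "stim_y": 0, "stim_z": 2}


def simulated_listener(a, b):
    """Hypothetical listener who always picks the higher-scored stimulus."""
    return a if hypothetical_scores[a] >= hypothetical_scores[b] else b


ranking, wins = rank_by_paired_comparisons(
    list(hypothetical_scores), simulated_listener
)
print(ranking)  # ['stim_x', 'stim_z', 'stim_w', 'stim_y']
print(wins)
```

With n items the procedure needs n(n-1)/2 comparisons per listener, which is one reason paired comparisons are usually applied to a small set of systems and conditions.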

Bibliographic record

  • Source
    Computer speech and language | 2005, Issue 2 | pp. 129-146 | 18 pages
  • Author affiliations

    MARCS Auditory Laboratories, School of Psychology, University of Western Sydney-Bankstown campus, Locked Bag 1797, Penrith South DC, NSW 1797, Australia;

    MARCS Auditory Laboratories, School of Psychology, University of Western Sydney-Bankstown campus, Locked Bag 1797, Penrith South DC, NSW 1797, Australia;

    Appen Pty Ltd., NSW, Australia;

    MARCS Auditory Laboratories, School of Psychology, University of Western Sydney-Bankstown campus, Locked Bag 1797, Penrith South DC, NSW 1797, Australia;

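  • Indexed in: Science Citation Index (SCI), USA; Engineering Index (EI), USA
  • Original format: PDF
  • Language of text: English (eng)
  • Chinese Library Classification: Computing technology, computer technology
  • Keywords

  • Date added to database: 2022-08-18 02:12:28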
