Journal: Universal Access in the Information Society

Auditory universal accessibility of data tables using naturally derived prosody specification


Abstract

Text documents usually embody visually oriented meta-information in the form of complex visual structures, such as tables. The semantics carried by such objects lead to poor and ambiguous text-to-speech synthesis. Although most speech synthesis frameworks allow consistent control of many parameters, such as prosodic cues, through appropriate markup, there is no actual prosodic specification for speech-enabling visual elements. This paper presents a method for modelling the acoustic specification of simple and complex data tables, derived from the human paradigm. A series of psychoacoustic experiments was set up to obtain speech properties from prosodic analysis of natural spoken descriptions of data tables. Thirty blind and thirty sighted listeners selected the most prominent natural rendition. The derived prosodic phrase accent and pause break placement vectors were modelled using the ToBI semiotic system to convey semantically important visual information through prosody control. The quality of information provision of speech-synthesized tables using the proposed prosody specification was evaluated by first-time listeners. The results show a significant increase (from 14 to 20%, depending on the table type) in users' subjective understanding (overall impression, listening effort and acceptance) of the semantic structure of the table data, compared to the traditional linearized speech synthesis of tables. Furthermore, the results show that prosody manipulation can be applied successfully to data tables using generic specification sets for certain table types and browsing techniques, resulting in improved data comprehension.
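To make the idea of controlling prosody through markup concrete, the following is a minimal sketch that renders a simple table as SSML, with a short pause between cells, a longer pause between rows, and a pitch change on header cells. It is not the specification derived in the paper: the function name `table_to_ssml`, the break durations, and the pitch value are all hypothetical placeholders chosen for illustration, and the paper's ToBI-based phrase accents are not modelled here.

```python
from xml.sax.saxutils import escape


def table_to_ssml(header, rows, cell_break_ms=300, row_break_ms=700):
    """Render a simple table as SSML, one sentence per row.

    The break durations and the pitch change are illustrative
    placeholders, not values taken from the paper.
    """
    cell_break = f'<break time="{cell_break_ms}ms"/>'
    row_break = f'<break time="{row_break_ms}ms"/>'

    def render_row(cells, is_header=False):
        spoken = []
        for cell in cells:
            # Escape &, < and > so cell text cannot break the markup.
            text = escape(str(cell))
            if is_header:
                # Assumed cue: raise pitch slightly on header cells.
                text = f'<prosody pitch="+10%">{text}</prosody>'
            spoken.append(text)
        # Short pause between the cells of one row.
        return "<s>" + cell_break.join(spoken) + "</s>"

    parts = [render_row(header, is_header=True)]
    parts += [render_row(row) for row in rows]
    # Longer pause between consecutive rows.
    return "<speak>" + row_break.join(parts) + "</speak>"


ssml = table_to_ssml(["City", "Temp"], [["Athens", 31], ["Oslo", 18]])
print(ssml)
```

The resulting string could be passed to any synthesizer that accepts SSML. Keeping the pause lengths as parameters reflects the abstract's point that different table types and browsing techniques may need different specification sets.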
