Giving Voices to Multimodal Applications

Abstract

Speech interaction is important and useful in a wide range of applications. It is a natural way of interacting and is easy for most people to use. Developing speech-enabled applications is a big challenge, and one that grows when several languages are required, a common scenario in Europe, for example. Tackling this challenge requires methods and tools that make speech features easier to deploy, providing developers with versatile means to include speech interaction in their applications. In addition, only a limited variety of voices is available (sometimes only one per language), which makes it difficult to satisfy user preferences and hinders a deeper exploration of how well voices suit specific applications and users. In this article, we present some of our contributions to these issues: (a) our generic modality, which encapsulates the technical details of using speech synthesis; (b) the process followed to create four new voices, two young adult and two elderly; and (c) some initial results exploring user preferences regarding the created voices. The preliminary studies targeted groups including both young and older adults and addressed: (a) evaluation of the intrinsic properties of each voice; (b) observation of users while using speech-enabled interfaces and elicitation of qualitative impressions regarding the chosen voice and the impact of speech interaction on user satisfaction; and (c) ranking of voices according to preference. The collected results, albeit preliminary, yield some evidence of the positive impact speech interaction has on users, at different levels. Additionally, the results show interesting differences in voice preferences across age groups and genders.
