Journal: Computer Speech and Language

Enabling effective design of multimodal interfaces for speech-to-speech translation system: An empirical study of longitudinal user behaviors over time and user strategies for coping with errors

Abstract

This study presents an empirical analysis of long-term changes in user behavior and of the varying user strategies observed during cross-lingual interaction with the USC/SAIL multimodal speech-to-speech (S2S) translation system. The goal is to inform user-adaptive designs of such systems. A 4-week study based on medical scenarios provides the basis for the analysis. The data comprise user interviews, post-session surveys, and extensive system logs that were post-processed and annotated. The annotations measure meaning transfer rates using human evaluations and a scale defined here, called the concept matching score. First, a qualitative analysis investigates user strategies for dealing with errors (repeat, rephrase, change topic, start over) and the participants' self-reported longitudinal adaptation to errors. Post-session surveys explore participants' experience with the system and point to a trend of increasing user-perceived performance over time. The log data analysis yields further results. Users chose to let utterances with some degradation of their intended meaning (retaining 84% of the original concepts) proceed through the system, even after they observed potential errors in the visual output of the speech recognizer. The utterances they rejected retained, on average, only 25% of the original concepts. As a result of this user filtering, after the complete channel transfer through the S2S system, 91% of the successful turns transfer at least half of the intended concepts, while 90% of the user-rejected turns would have conveyed less than half of the intended meaning. Compared with the speech-only modality, the multimodal interface yields a 24% relative improvement in the confirmation mode and a 31% relative improvement in the choice mode. The analysis also shows that users of the multimodal interface change their strategies over time by accepting more system-produced choices. This behavior can expedite communication, seeking an operating balance between user strategies and system performance factors. Lastly, user utterance length is analyzed. Longer utterances generally deliver more information per utterance, but potentially at the cost of increased processing degradation. The analysis demonstrates that users shorten their utterances after unsuccessful turns and lengthen them after successful turns, and that a learning effect strengthens this behavior over the duration of the study.
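The abstract reports its results as fractions of intended concepts conveyed and as relative improvements over a baseline. The sketch below illustrates that arithmetic only. It is not the paper's concept matching score, whose definition is not given in the abstract: the set-overlap measure, the function names `concept_fraction` and `relative_improvement`, the example concepts, and the metric values are all hypothetical, and the relative-improvement formula assumes a higher-is-better metric.

```python
def concept_fraction(intended, conveyed):
    """Fraction of intended concepts that also appear among the conveyed ones.

    Hypothetical stand-in for a concept-level score; the paper's actual
    concept matching score may be defined differently.
    """
    intended = set(intended)
    if not intended:
        raise ValueError("intended concepts must be non-empty")
    return len(intended & set(conveyed)) / len(intended)


def relative_improvement(baseline, new):
    """Relative change of a higher-is-better metric with respect to a baseline."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (new - baseline) / baseline


# Made-up turn: four intended concepts, three survive the S2S channel.
intended = ["patient", "fever", "three", "days"]
conveyed = ["patient", "fever", "days"]
score = concept_fraction(intended, conveyed)
conveys_at_least_half = score >= 0.5  # the "at least half" threshold from the abstract

# Made-up metric values, chosen only to show the formula.
gain = relative_improvement(baseline=0.40, new=0.50)

print(f"concept fraction: {score:.2f}, at least half: {conveys_at_least_half}")
print(f"relative improvement: {gain:.0%}")
```

Under this reading, a figure such as "24% relative improvement" describes the change as a proportion of the speech-only baseline, not a difference of 24 percentage points.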
