Conference: Machine Learning for Multimodal Interaction

Combining User Modeling and Machine Learning to Predict Users' Multimodal Integration Patterns



Abstract

Temporal as well as semantic constraints on fusion are at the heart of multimodal system processing. The goal of the present work is to develop user-adaptive temporal thresholds with improved performance characteristics over state-of-the-art fixed ones, which can be accomplished by leveraging both empirical user modeling and machine learning techniques to handle the large individual differences in users' multimodal integration patterns. Using simple Naive Bayes learning methods and a leave-one-out training strategy, our model correctly predicted 88% of users' mixed speech and pen signal input as either unimodal or multimodal, and 91% of their multimodal input as either sequentially or simultaneously integrated. In addition to predicting a user's multimodal pattern in advance of receiving input, predictive accuracies also were evaluated after the first signal's end-point detection, the earliest time at which a speech/pen multimodal system makes a decision regarding fusion. This system-centered metric yielded accuracies of 90% and 92%, respectively, for classification of unimodal/multimodal and sequential/simultaneous input patterns. In addition, empirical modeling revealed a 0.92 correlation between users' multimodal integration pattern and their likelihood of interacting multimodally, which may have accounted for the superior learning obtained with training over heterogeneous user data rather than data partitioned by user subtype. Finally, in large part due to guidance from user modeling, the techniques reported here required as few as 15 samples to predict a "surprise" user's input patterns.
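To make the classification setup concrete, the following is a minimal sketch of a Naive Bayes classifier evaluated with the leave-one-out strategy the abstract describes. The feature (inter-signal lag between pen and speech onsets) and all data values are invented for illustration; the paper's actual features and labels are not specified here.

```python
import math
from collections import defaultdict

# Hypothetical training data: (inter-signal lag in ms, label).
# Overlapping pen and speech signals (lag <= 0) are labeled "SIM"
# (simultaneous); a gap between them is labeled "SEQ" (sequential).
# All values are invented for illustration only.
samples = [
    (-300, "SIM"), (-150, "SIM"), (-50, "SIM"), (0, "SIM"), (-200, "SIM"),
    (400, "SEQ"), (650, "SEQ"), (900, "SEQ"), (500, "SEQ"), (1200, "SEQ"),
]

def gaussian_nb_predict(train, x):
    """Classify lag x with a one-feature Gaussian Naive Bayes model."""
    by_label = defaultdict(list)
    for lag, label in train:
        by_label[label].append(lag)
    best_label, best_score = None, float("-inf")
    for label, lags in by_label.items():
        mean = sum(lags) / len(lags)
        var = sum((v - mean) ** 2 for v in lags) / len(lags) or 1.0
        # log prior + log Gaussian likelihood of the observed lag
        score = math.log(len(lags) / len(train))
        score += -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Leave-one-out evaluation: hold out each sample in turn, train on the rest.
correct = sum(
    gaussian_nb_predict(samples[:i] + samples[i + 1:], samples[i][0])
    == samples[i][1]
    for i in range(len(samples))
)
accuracy = correct / len(samples)
```

On these cleanly separated toy lags the leave-one-out accuracy is perfect; the point of the sketch is only the evaluation shape, in which every user's held-out sample is scored by a model trained on all remaining data.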

Bibliographic details

  • Source
  • Venue: Bethesda, MD (US)
  • Author affiliations

    Natural Interaction Systems, 10260 SW Greenburg Road, Suite 400, Portland, OR 97223;

    Natural Interaction Systems, 10260 SW Greenburg Road, Suite 400, Portland, OR 97223; Center for Human-Computer Communication, Computer Science Department, Oregon Health and Science University, Beaverton, OR 97006;

  • Conference organizer
  • Original format: PDF
  • Language: English
  • CLC classification: Programming languages, algorithmic languages
  • Keywords

