IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops

Visual Focus of Attention Estimation in 3D Scene with an Arbitrary Number of Targets

Abstract

Visual Focus of Attention (VFOA) estimation in conversation is challenging because it relies on difficult-to-estimate information (gaze) combined with scene features such as target positions and other contextual cues (speaking status) that help disambiguate situations. Previous VFOA models fusing all of these features are usually trained for a specific setup with a fixed number of interacting people, and must be retrained to be applied to a different one, which limits their usability. To address these limitations, we propose a novel deep learning method that encodes all input features as a fixed number of 2D maps, which makes the input more naturally processed by a convolutional neural network, provides scene normalization, and allows an arbitrary number of targets to be considered. Experiments performed on two publicly available datasets demonstrate that the proposed method can be trained in a cross-dataset fashion without loss in VFOA accuracy compared to intra-dataset training.
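
To make the map-based encoding concrete, below is a minimal sketch of the general idea rather than the authors' implementation: target positions, speaking status, and a gaze cue are rendered as a fixed set of 2D maps, so the channel count stays constant no matter how many targets are in the scene. The map size, Gaussian rendering, and the gaussian_map / encode_scene helpers are hypothetical choices made for illustration only.

import numpy as np

def gaussian_map(height, width, center_xy, sigma=3.0):
    # 2D Gaussian bump centred on a normalized (x, y) scene position.
    ys, xs = np.mgrid[0:height, 0:width]
    cx = center_xy[0] * (width - 1)
    cy = center_xy[1] * (height - 1)
    return np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2.0 * sigma ** 2))

def encode_scene(targets, gaze_point, map_size=(64, 64)):
    # Encode an arbitrary number of targets plus a gaze cue as a fixed
    # number of 2D maps (target map, speaking map, gaze map).
    # targets: list of dicts with normalized 'position' (x, y) in [0, 1]^2
    #          and a boolean 'speaking' flag.
    # gaze_point: normalized (x, y) where the estimated gaze intersects the
    #             scene plane (hypothetical representation of the gaze cue).
    h, w = map_size
    target_map = np.zeros((h, w))
    speaking_map = np.zeros((h, w))
    for t in targets:
        bump = gaussian_map(h, w, t["position"])
        target_map = np.maximum(target_map, bump)
        if t["speaking"]:
            speaking_map = np.maximum(speaking_map, bump)
    gaze_map = gaussian_map(h, w, gaze_point)
    return np.stack([target_map, speaking_map, gaze_map])

# Example: three targets, one of them speaking; the output shape is the
# same regardless of how many targets are present.
maps = encode_scene(
    targets=[
        {"position": (0.2, 0.5), "speaking": True},
        {"position": (0.5, 0.3), "speaking": False},
        {"position": (0.8, 0.6), "speaking": False},
    ],
    gaze_point=(0.45, 0.35),
)
print(maps.shape)  # (3, 64, 64)

A fixed stack of maps like this can be fed to a standard convolutional network, which is what makes the representation independent of the number of targets and of the particular scene setup.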