Scene and Natural Language Understanding Fusion towards Enhanced In-Vehicle Virtual Assistant



Abstract

The invention introduces a system and methodology that fuses scene understanding and natural language understanding to resolve ambiguities in the user's verbal (natural) interaction with the in-vehicle virtual assistant, with a special focus on POIs (places of interest) along the road that are visible to both the user and the vehicle. The proposal assumes that the vehicle is already equipped with external sensors (including scene perception software capabilities) for automated and/or autonomous driving.

We address ambiguities in the user's spoken intent during interaction with the in-vehicle virtual assistant and propose to resolve them by understanding the scene outside the vehicle, leveraging the vehicle's external visual sensors. Examples include "What are the opening hours of this restaurant?" and "Is there a parking space over there?". Understanding the scene (1) enables the resolution of queries that refer not only to a fixed location but also to other objects in the scene that change dynamically, such as people, cars, and objects on the street, and (2) reduces the need for an in-vehicle driver monitoring (gaze/head pose) system to infer intent from gaze.

User intent understanding is a key challenge for conversational virtual assistants. Existing systems address this challenge with multi-modal user data, focusing specifically on gaze information. Our invention instead proposes using the external sensors and scene perception ability that already exist in the vehicle for automated driving to understand the user's spoken interaction. We propose to fuse real-time external scene understanding output with the user's spoken query and thus enhance the natural language understanding ability of the in-vehicle virtual assistant. We also show how the resolution of the intent can be automatically delegated to the right module in the system (e.g., automatic park assist).

"Disclosed anonymously"

This invention is a new and non-obvious methodology that performs intent resolution for speech queries relating to POIs using multi-modal (speech and visual) data. The method:

1. Determines, using the NLU (natural language understanding) system, whether the query references an external object that requires scene perception (e.g., a specific restaurant, a person on the street, a vehicle).
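As a rough illustration of the fusion described above, the sketch below pairs an NLU parse of the spoken query with objects reported by the vehicle's external scene perception and hands the resolved intent to a dispatching module. All names and interfaces here (NluResult, SceneObject, the nlu, scene_perception, and dispatch objects) are hypothetical stand-ins, not APIs defined in the disclosure.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical output types; fields are illustrative only.
@dataclass
class NluResult:
    intent: str                         # e.g. "query_opening_hours", "find_parking"
    external_reference: Optional[str] = None  # e.g. "this restaurant", or None

@dataclass
class SceneObject:
    label: str          # e.g. "restaurant", "parking_space", "pedestrian"
    bearing_deg: float  # direction of the object relative to the vehicle heading
    attributes: dict = field(default_factory=dict)

def resolve_intent(utterance, nlu, scene_perception, dispatch):
    """Fuse NLU output with real-time external scene understanding to
    resolve a deictic spoken query and delegate it to the right module."""
    result: NluResult = nlu.parse(utterance)

    # Step 1 of the disclosed method: check whether the query references
    # an external object that requires scene perception at all.
    if result.external_reference is None:
        return dispatch(result.intent, target=None)

    # Query the scene model built from the external sensors for candidate
    # objects whose category matches the referenced phrase.
    ref = result.external_reference.lower()
    candidates = [obj for obj in scene_perception.current_objects()
                  if obj.label.lower() in ref or ref in obj.label.lower()]

    # Pick the most salient candidate; here simply the object closest to the
    # direction of travel, as a placeholder for a real grounding model.
    target = min(candidates, key=lambda o: abs(o.bearing_deg), default=None)

    # Delegate resolution to the appropriate module, e.g. automatic park
    # assist for parking intents, or a POI lookup service otherwise.
    return dispatch(result.intent, target=target)
```

Under these assumptions, a query such as "Is there a parking space over there?" would yield a "find_parking" intent with a grounded parking-space object, which the dispatcher could route to the automatic park assist module mentioned in the disclosure.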

Bibliographic Information

  • Source: Research Disclosure, 2021, Issue 686, p. 1966 (1 page)
  • Format: PDF
  • Language: English
