
Structured Annotations of Web Queries



Abstract

Queries asked on web search engines often target structured data, such as commercial products, movie showtimes, or airline schedules. However, surfacing relevant results from such data is a highly challenging problem, due to the unstructured language of web queries and the imposing scalability and speed requirements of web search. In this paper, we discover latent structured semantics in web queries and produce Structured Annotations for them. We consider an annotation to be a mapping of a query to a table of structured data and to attributes of that table. Given a collection of structured tables, we present a fast and scalable tagging mechanism for obtaining all possible annotations of a query over these tables. However, we observe that for a given query only a few of these annotations are sensible for the user's needs. We thus propose a principled probabilistic scoring mechanism, using a generative model, for assessing the likelihood of a structured annotation, and we define a dynamic threshold for filtering out misinterpreted query annotations. Our techniques are completely unsupervised, obviating the need for costly manual labeling effort. We evaluated our techniques using real-world queries and data and present promising experimental results.
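
To make the notion of an annotation concrete, the following is a minimal illustrative sketch, not the tagging mechanism of the paper. It assumes toy tables held in memory, whitespace tokenization, and exact value matching; the names `TABLES` and `annotate` and the example data are hypothetical. It enumerates every mapping of query tokens to attributes of a table and does not cover the probabilistic scoring or the threshold.

```python
from itertools import product

# Hypothetical toy tables: table name -> attribute -> set of known values.
TABLES = {
    "cameras": {"brand": {"canon", "nikon"}, "type": {"digital", "slr"}},
    "movies": {"title": {"canon"}, "genre": {"drama"}},
}

def annotate(query, tables):
    """Enumerate every candidate annotation of `query` over `tables`.

    Each annotation is (table, mapping), where mapping pairs each token
    with an attribute name, or None when the token is left as free text.
    """
    tokens = query.lower().split()
    annotations = []
    for table, attrs in tables.items():
        # For each token: attributes whose values contain it, plus None.
        options = [
            [a for a, vals in attrs.items() if tok in vals] + [None]
            for tok in tokens
        ]
        for choice in product(*options):
            # Skip the trivial case where no token maps to any attribute.
            if any(a is not None for a in choice):
                annotations.append((table, tuple(zip(tokens, choice))))
    return annotations

result = annotate("canon digital camera", TABLES)
for table, mapping in result:
    print(table, mapping)
```

For this toy query the enumeration also yields an annotation over the hypothetical `movies` table, where "canon" is read as a title. That is the kind of implausible candidate a scoring and filtering step would be expected to discard.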
