Journal: Engineering Applications of Artificial Intelligence

Improving relevance in a content pipeline via syntactic generalization


Abstract

This is a report from the field on a linguistically based relevance technology, built on learning of parse trees, for processing, classification and delivery of a stream of texts. We describe the content pipeline for the eBay entertainment domain, which employs this technology, and show that text processing relevance is the main bottleneck for its performance. A number of components of the content pipeline, such as content mining, aggregation, deduplication, opinion mining and integrity enforcement, need to rely on domain-independent, efficient text classification, entity extraction and relevance assessment operations. Text relevance assessment is based on the operation of syntactic generalization (SG), which finds a maximum common sub-tree for a pair of parse trees for sentences. The relevance of two portions of text is then defined as the cardinality of this sub-tree. SG is intended to replace keyword-based analysis with a more accurate assessment of relevance that takes phrase-level and sentence-level information into account. In the special case where short expressions are commonly used terms, such as Facebook likes, SG ascends to the level of categories, and a reasoning technique is required to map these categories in the course of relevance assessment. A number of content pipeline components employ web mining, which needs SG to compare web search results. We describe how SG works in a number of components of the content pipeline, including personalization and recommendation, and provide evaluation results for the eBay deployment.
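To make the idea of scoring relevance by the cardinality of a common sub-tree concrete, here is a minimal Python sketch. It is not the authors' implementation. It assumes parse trees are given as nested tuples of labels, that two nodes match only when their labels are identical, and that child order is preserved. The helper names `common_size`, `subtrees` and `sg_score` are hypothetical. The abstract does not specify how nodes are matched or generalized, so this is a deliberate simplification.

```python
from functools import lru_cache

# Assumed representation: a node is a tuple (label, child, child, ...);
# a leaf is a one-element tuple such as ("NN",).

@lru_cache(maxsize=None)
def common_size(a, b):
    """Node count of the largest common ordered sub-tree rooted at a and b."""
    if a[0] != b[0]:
        return 0  # roots differ, so nothing in common at this pair
    kids_a, kids_b = a[1:], b[1:]
    # LCS-style alignment of the two child sequences, where aligning
    # child i with child j contributes the size of their common sub-tree.
    dp = [[0] * (len(kids_b) + 1) for _ in range(len(kids_a) + 1)]
    for i in range(1, len(kids_a) + 1):
        for j in range(1, len(kids_b) + 1):
            dp[i][j] = max(
                dp[i - 1][j],
                dp[i][j - 1],
                dp[i - 1][j - 1] + common_size(kids_a[i - 1], kids_b[j - 1]),
            )
    return 1 + dp[-1][-1]

def subtrees(tree):
    """Yield the tree itself and every sub-tree below it."""
    yield tree
    for child in tree[1:]:
        yield from subtrees(child)

def sg_score(t1, t2):
    """Relevance score: size of the largest common sub-tree over all node pairs."""
    return max(common_size(a, b) for a in subtrees(t1) for b in subtrees(t2))

# Two toy constituency trees that differ only in where the determiner sits.
t1 = ("S", ("NP", ("DT",), ("NN",)), ("VP", ("VB",), ("NP", ("NN",))))
t2 = ("S", ("NP", ("NN",)), ("VP", ("VB",), ("NP", ("DT",), ("NN",))))
print(sg_score(t1, t2))  # 7
```

Each toy tree has eight nodes, and seven of them can be aligned, so the score is 7. A keyword-overlap measure would ignore this structural agreement entirely. A production system would presumably match on part-of-speech tags and lemmas rather than bare labels.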
