Journal: Engineering Applications of Artificial Intelligence

Transfer learning of syntactic structures for building taxonomies for search engines

Abstract

We apply a transfer learning paradigm to build a taxonomy of entities intended to improve search engine relevance in a vertical domain. The taxonomy construction process starts from seed entities and mines available source domains for new entities associated with them. New entities are formed by applying machine learning over syntactic parse trees (specifically, their generalizations) to the search results for existing entities, in order to find commonalities among those results. These commonality expressions then become parameters of existing entities and are turned into new entities at the next learning iteration. To match natural language expressions between source and target domains, we use syntactic generalization, an operation that finds a set of maximal common sub-trees of the constituency parse trees of these expressions. Taxonomy and syntactic generalization are applied to relevance improvement in search and to text similarity assessment. We evaluate the search relevance improvement in vertical and horizontal domains and observe a significant contribution of the learned taxonomy in the former and a noticeable contribution of a hybrid system in the latter. We also perform an industrial evaluation of taxonomy-based and syntactic generalization-based text relevance assessment and conclude that the proposed algorithm for automated taxonomy learning is suitable for integration into industrial systems. The proposed algorithm is implemented as a component of the Apache OpenNLP project.
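The iterative loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation and not the OpenNLP component. It makes two loud simplifications: the tree-based syntactic generalization is replaced by a weighted longest common subsequence over POS-tagged tokens, and the search engine is replaced by a hard-coded dictionary. The names `generalize`, `new_entities`, `build_taxonomy`, `toy_search`, and `TOY_RESULTS` are all hypothetical.

```python
from itertools import combinations

# A phrase is a list of (POS tag, word) pairs. Tagging is assumed to be done
# upstream; the data below is hand-tagged toy input, not real search output.
TOY_RESULTS = {
    "tax": [
        [("VB", "file"), ("NN", "tax"), ("NN", "return"),
         ("IN", "before"), ("NN", "deadline")],
        [("VB", "amend"), ("DT", "a"), ("NN", "tax"), ("NN", "return"),
         ("IN", "after"), ("NN", "deadline")],
        [("VB", "claim"), ("NN", "tax"), ("NN", "deduction")],
    ],
    "return": [
        [("VB", "file"), ("NN", "return"), ("IN", "with"), ("NN", "extension")],
        [("VB", "request"), ("NN", "return"), ("NN", "extension")],
    ],
}


def toy_search(entity):
    """Hypothetical stand-in for a search engine call."""
    return TOY_RESULTS.get(entity, [])


def generalize(a, b):
    """Token-level simplification of syntactic generalization.

    Tokens align only when their POS tags agree. An aligned pair keeps the
    word when both words are equal and becomes '*' otherwise. Alignments that
    preserve words are preferred (score 2) over POS-only matches (score 1).
    """
    n, m = len(a), len(b)
    # score[i][j] is the best alignment score for suffixes a[i:] and b[j:].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            best = max(score[i + 1][j], score[i][j + 1])
            if a[i][0] == b[j][0]:
                gain = 2 if a[i][1] == b[j][1] else 1
                best = max(best, gain + score[i + 1][j + 1])
            score[i][j] = best
    # Walk the table forward to recover one optimal alignment.
    out, i, j = [], 0, 0
    while i < n and j < m:
        if a[i][0] == b[j][0]:
            gain = 2 if a[i][1] == b[j][1] else 1
            if score[i][j] == gain + score[i + 1][j + 1]:
                out.append((a[i][0], a[i][1] if gain == 2 else "*"))
                i += 1
                j += 1
                continue
        if score[i + 1][j] >= score[i][j + 1]:
            i += 1
        else:
            j += 1
    return out


def new_entities(results, known):
    """Nouns shared by at least one pair of results and not yet known."""
    found = []
    for a, b in combinations(results, 2):
        for pos, word in generalize(a, b):
            if (pos.startswith("NN") and word != "*"
                    and word not in known and word not in found):
                found.append(word)
    return found


def build_taxonomy(seeds, search, iterations=2):
    """Grow a parent -> children map from seed entities, one level per pass."""
    taxonomy, known, frontier = {}, set(seeds), list(seeds)
    for _ in range(iterations):
        next_frontier = []
        for entity in frontier:
            children = new_entities(search(entity), known)
            taxonomy[entity] = children
            known.update(children)
            next_frontier.extend(children)
        frontier = next_frontier
    return taxonomy


if __name__ == "__main__":
    print(build_taxonomy(["tax"], toy_search, iterations=2))
```

In this sketch, the shared nouns found at one iteration become the entities searched at the next, which mirrors the abstract's statement that commonality expressions are turned into new entities at the next learning iteration. A closer approximation would operate on constituency parse trees rather than flat token sequences.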
