Conference: IEEE International Conference on Semantic Computing

Combining Neural Networks and Pattern Matching for Ontology Mining - a Meta Learning Inspired Approach



Abstract

Several applications dealing with natural language text involve automated validation of membership in a given category (e.g., France is a country; Gladiator is a movie, but not a country). Meta-learning is a recent and powerful machine learning approach whose goal is to train a model (or a family of models) on a variety of learning tasks so that it can solve new learning tasks more efficiently, e.g., using fewer training samples or less time. We present an original approach inspired by meta-learning and consisting of two tiers of models: for any arbitrary category, our general model supplies high-confidence training instances (seeds) for our category-specific models. Our general model is based on pattern matching and is optimized for precision at top N, while its recall is not important. Our category-specific models are based on recurrent neural networks (RNNs), which have recently proven extremely effective in several natural language applications, such as machine translation, sentiment analysis, parsing, and chatbots. Following meta-learning principles, we train our highest-level (general) model in such a way that our second-tier category-specific models (which depend on it) are optimized for the best possible performance in a specific application. This work is important because our approach is capable of verifying membership in an arbitrary category defined by a sequence of words, including longer and more complex categories such as "Ridley Scott movie" or "City in southern Germany" that are not currently supported by existing manually created ontologies (such as Freebase, WordNet, or Wikidata). Also, our approach uses only raw text and thus can be useful when no such ontologies are available, which is a common situation for languages other than English. Even the largest English ontologies are known to have low coverage, insufficient for many practical applications such as automated question answering, which we use here to illustrate the advantages of our approach. We rigorously test it on a larger number of questions than previous studies and demonstrate that, when coupled with a simple answer-scoring mechanism, our meta-learning-inspired approach 1) provides up to 50% improvement over prior approaches that do not use any manually curated knowledge bases and 2) achieves state-of-the-art performance among all current approaches, including those that take advantage of such knowledge bases.
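The first tier described in the abstract (a pattern-matching model that proposes high-confidence seeds and is judged by precision at top N) can be illustrated with a minimal sketch. The abstract does not state which patterns the authors use, so the three lexical patterns below, the capitalized-word heuristic for candidates, and the names `extract_seeds` and `precision_at_n` are all assumptions made for illustration. The second tier (the category-specific RNN trained on the seeds) is omitted.

```python
import re
from collections import Counter

# Toy corpus, invented for this illustration only.
SENTENCES = [
    "Gladiator is a movie directed by Ridley Scott.",
    "Gladiator is a movie that won several awards.",
    "She enjoys movies such as Gladiator and Alien.",
    "Alien is a movie set in space.",
    "We watched the movie Alien last night.",
    "France is a country, not a movie.",
]


def extract_seeds(category, sentences, top_n=3):
    """Rank candidate members of `category` by how many pattern matches support them.

    The patterns are assumed, hand-written examples; they are not taken from the paper.
    """
    cat = re.escape(category)
    # Candidate: a run of capitalized words (a crude stand-in for entity detection).
    entity = r"\b([A-Z][\w-]*(?: [A-Z][\w-]*)*)"
    patterns = [
        re.compile(entity + r" is an? " + cat + r"\b"),        # "X is a <category>"
        re.compile(r"\b" + cat + r"s? such as " + entity),     # naive plural: appends "s"
        re.compile(r"\bthe " + cat + r" " + entity),           # "the <category> X"
    ]
    counts = Counter()
    for sentence in sentences:
        for pattern in patterns:
            for match in pattern.finditer(sentence):
                counts[match.group(1)] += 1
    # Only the best-supported candidates are kept: precision matters, recall does not.
    return [candidate for candidate, _ in counts.most_common(top_n)]


def precision_at_n(ranked, gold, n):
    """Fraction of the top-n ranked candidates that are true members of the category."""
    top = ranked[:n]
    if not top:
        return 0.0
    return sum(1 for candidate in top if candidate in gold) / len(top)


if __name__ == "__main__":
    seeds = extract_seeds("movie", SENTENCES, top_n=2)
    print(seeds, precision_at_n(seeds, {"Gladiator", "Alien"}, 2))
```

In a pipeline of the kind the abstract describes, the returned seeds would serve as positive training instances for a category-specific classifier. Keeping only the top-ranked candidates reflects the stated design choice of optimizing the general model for precision at top N rather than recall.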
