Conference: ACM Symposium on Applied Computing
Learning the ontological theory of an information extraction system in the multi-predicate ILP setting

Abstract

In recent years, many Information Extraction (IE) systems have been designed to extract genic interaction networks from text. Usually, extraction is performed by so-called extraction patterns, which are often limited to mapping textual fragments to a single semantic relation. Such poor representations do not take into account the complexity of the data processed by biologists. IE systems need sophisticated representations, encoded with ontologies, that allow the definition of multiple relations and of the (possibly recursive) dependencies between them. Up to now, the Machine Learning techniques used to acquire extraction patterns, i.e. binary or multi-class learners, have reflected those representation restrictions: they assume independence between target predicates and do not handle recursion. In this paper, we use Inductive Logic Programming (ILP) in a multi-predicate setting to learn extraction patterns fitted to an ontological context. Multi-predicate ILP is an important paradigm that allows recursive theories to be learned. We evaluated our framework on a Bacillus subtilis text corpus, on which we reach a global recall of 67.7% and a precision of 75.5% in ten-fold cross-validation.
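To make the idea of mutually dependent, recursive target predicates more concrete, here is a minimal Python sketch. It is not the authors' system or their ontology: the predicate names (`pattern_match`, `direct`, `interacts`), the three rules, and the gene identifiers are all hypothetical, chosen only to show how one relation can depend on another and on itself, and how recall and precision over the derived facts could be computed.

```python
from typing import Set, Tuple

Fact = Tuple[str, str, str]  # (predicate, arg1, arg2)


def apply_rules(facts: Set[Fact]) -> Set[Fact]:
    """Apply one forward-chaining step of a toy, hypothetical theory."""
    derived = set(facts)
    for pred, a, b in facts:
        # Hypothetical rule 1: direct(A, B) :- pattern_match(A, B).
        if pred == "pattern_match":
            derived.add(("direct", a, b))
        # Hypothetical rule 2: interacts(A, B) :- direct(A, B).
        if pred == "direct":
            derived.add(("interacts", a, b))
    # Hypothetical rule 3 (recursive, depends on another predicate):
    # interacts(A, C) :- interacts(A, B), direct(B, C).
    for p1, a, b in facts:
        for p2, b2, c in facts:
            if p1 == "interacts" and p2 == "direct" and b == b2:
                derived.add(("interacts", a, c))
    return derived


def closure(facts: Set[Fact]) -> Set[Fact]:
    """Iterate the rules until no new fact is derived (a fixpoint)."""
    current = set(facts)
    while True:
        nxt = apply_rules(current)
        if nxt == current:
            return current
        current = nxt


def precision_recall(predicted: Set[Fact], gold: Set[Fact]) -> Tuple[float, float]:
    """Precision and recall of predicted facts against a gold set."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Made-up textual evidence: two pattern matches between placeholder genes.
evidence = {("pattern_match", "geneA", "geneB"), ("pattern_match", "geneB", "geneC")}
all_facts = closure(evidence)
predicted = {f for f in all_facts if f[0] == "interacts"}

# Made-up gold annotations, for illustration only.
gold = {
    ("interacts", "geneA", "geneB"),
    ("interacts", "geneA", "geneC"),
    ("interacts", "geneA", "geneD"),
}
precision, recall = precision_recall(predicted, gold)
```

In this toy setting, `interacts(geneA, geneC)` can only be derived because the rule for `interacts` calls both itself and `direct`. A learner that treats each target predicate independently cannot express that kind of dependency.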