Journal: Data & Knowledge Engineering

Corpus-based semantic role approach in information retrieval

Abstract

This paper presents a method for determining the semantic role of the constituents of a sentence. The method, named SemRol, is a corpus-based approach that uses two different statistical models: conditional Maximum Entropy (ME) probability models and TiMBL, a memory-based learning program. It consists of three phases that use features built from words, lemmas, PoS tags and shallow parsing information. The method introduces a new phase into the Semantic Role Labeling task, which has usually been approached as a two-phase procedure consisting of argument recognition and argument labeling. In our view, the sense of the verbs in a sentence must be disambiguated first, because a different set of roles must be considered depending on the sense of the verb. For the argument labeling phase, a tuning procedure is presented, which identifies one of the best feature sets for that task. With this set, which is different for TiMBL and ME, precisions of 76.71% for TiMBL and 70.55% for ME are obtained. Furthermore, the semantic role information provided by SemRol could be used to extend Information Retrieval or Question Answering systems. We propose using this semantic information as an extension of an Information Retrieval system in order to reduce the number of documents or passages retrieved by the system.
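
The two ideas in the abstract that lend themselves to a small illustration are the sense-dependent role set and the use of role information to narrow retrieved passages. The following Python sketch is not the SemRol implementation. The role inventory, the labels `A0`/`A1`/`A2`, the sense identifiers, the `Passage` type and the hand-written annotations are all assumptions made for illustration. In the method described above, the annotations would come from the three phases (verb sense disambiguation, argument recognition, argument labeling).

```python
from dataclasses import dataclass

# Hypothetical role inventory keyed by (verb lemma, sense id).
# The labels and sense ids are placeholders, not taken from the paper.
ROLE_SETS = {
    ("give", "01"): ("A0", "A1", "A2"),  # giver, thing given, recipient
    ("run", "01"): ("A0", "A1"),         # operator, thing operated
    ("run", "02"): ("A0",),              # runner
}


def candidate_roles(lemma, sense):
    """Return the roles to consider once the verb sense is known."""
    return ROLE_SETS.get((lemma, sense), ())


@dataclass
class Passage:
    text: str
    lemma: str   # lemma of the predicate
    sense: str   # disambiguated verb sense (assumed given)
    roles: dict  # role label -> filler text (assumed given)


def filter_by_role(passages, role, term):
    """Keep passages in which `term` occurs inside the filler of `role`."""
    kept = []
    for p in passages:
        # Skip passages whose verb sense does not license this role.
        if role not in candidate_roles(p.lemma, p.sense):
            continue
        if term.lower() in p.roles.get(role, "").lower():
            kept.append(p)
    return kept


# Hand-annotated toy passages; a keyword match on "Mary" would return both.
passages = [
    Passage("The committee gave the award to Mary.", "give", "01",
            {"A0": "The committee", "A1": "the award", "A2": "Mary"}),
    Passage("Mary gave the committee a report.", "give", "01",
            {"A0": "Mary", "A1": "a report", "A2": "the committee"}),
]

# Requiring "Mary" as the recipient (A2) keeps only the first passage.
recipients = filter_by_role(passages, "A2", "Mary")
for p in recipients:
    print(p.text)
```

Under these assumptions, the role constraint removes a passage that plain term matching would have retrieved, which is the kind of reduction in retrieved passages the abstract proposes.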
