Journal: ACM Transactions on Information Systems

Shallow and Deep Syntactic/Semantic Structures for Passage Reranking in Question-Answering Systems


Abstract

In this article, we extensively study the use of syntactic and semantic structures, obtained with shallow and full syntactic parsers, for answer passage reranking. We propose several dependency- and constituent-based structures, also enriched with Linked Open Data (LD) knowledge, to represent pairs of questions and answer passages. We encode these tree structures in learning-to-rank (L2R) algorithms using tree kernels, which project them into tree substructure spaces where each dimension represents a powerful syntactic/semantic feature. Additionally, since we define links between question and passage structures, our tree kernel spaces also include relational structural features. We carried out extensive comparative experiments with our models on automatic answer selection benchmarks, covering different TREC QA corpora as well as the newer Wikipedia-based dataset WikiQA, which has been widely used to test sentence rerankers. The results consistently demonstrate that our structural semantic models achieve the state of the art in passage reranking. In particular, we derived the following important findings: (i) relational syntactic structures are essential to achieve superior results; (ii) models trained with dependency trees can outperform those trained with shallow trees, e.g., in the case of sentence reranking; (iii) external knowledge automatically generated with focus and question classifiers is very effective; and (iv) the semantic information derived from LD and incorporated in syntactic structures can be used to replace the knowledge provided by the above-mentioned classifiers. This is a remarkable advantage, as it enables our models to increase coverage and portability over new domains.
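
To make the idea of a tree kernel concrete, the following is a minimal sketch of the classic subset tree kernel of Collins and Duffy, which counts the tree fragments shared by two parse trees. It is an illustration only and not necessarily the kernel used in the article. The nested-tuple tree encoding, the function names (`nodes`, `production`, `delta`, `tree_kernel`), the decay parameter `lam`, and the two toy parse trees are all assumptions introduced here.

```python
from itertools import product

def nodes(tree):
    """Yield every internal node of a nested-tuple tree (label, child, ...)."""
    if isinstance(tree, tuple):
        yield tree
        for child in tree[1:]:
            yield from nodes(child)

def production(node):
    """Node label plus its children's labels; word leaves are tagged with '#'."""
    return (node[0],) + tuple(
        c[0] if isinstance(c, tuple) else ("#", c) for c in node[1:]
    )

def delta(n1, n2, lam):
    """Weighted count of common fragments rooted at n1 and n2."""
    if production(n1) != production(n2):
        return 0.0
    score = lam
    # Equal productions guarantee the children align position by position.
    for c1, c2 in zip(n1[1:], n2[1:]):
        if isinstance(c1, tuple):
            score *= 1.0 + delta(c1, c2, lam)
    return score

def tree_kernel(t1, t2, lam=1.0):
    """Sum delta over all pairs of internal nodes (hypothetical helper)."""
    return sum(delta(a, b, lam) for a, b in product(nodes(t1), nodes(t2)))

# Toy constituency trees (assumed examples, not taken from the article).
T_DOG = ("S", ("NP", ("D", "the"), ("N", "dog")), ("VP", ("V", "barks")))
T_CAT = ("S", ("NP", ("D", "the"), ("N", "cat")), ("VP", ("V", "barks")))

if __name__ == "__main__":
    print(tree_kernel(T_DOG, T_DOG))
    print(tree_kernel(T_DOG, T_CAT))
```

Each shared fragment corresponds to one dimension of the implicit feature space, so the kernel value is a dot product that is never materialized. A kernel-based ranker such as an SVM could then consume a Gram matrix built from such values; how the article combines kernels over question and passage trees is not specified on this page.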