A Discourse Search Engine Based on Rhetorical Structure Theory

Abstract

Representing a document as a bag of words and using keywords to retrieve relevant documents has seen great success in large-scale information retrieval systems such as Web search engines. The bag-of-words representation is computationally efficient and, with proper term weighting and document ranking methods, performs surprisingly well for such a simple document representation. However, it ignores the rich discourse structure of a document, which could provide useful clues when determining the relevance of a document to a given user query. We develop the first Discourse Search Engine (DSE), which exploits the discourse structure of documents to overcome the limitations of bag-of-words document representations in information retrieval. We use Rhetorical Structure Theory (RST) to represent a document as a discourse tree connecting numerous elementary discourse units (EDUs) via discourse relations. Given a query, our discourse search engine can retrieve not only documents relevant to the query, but also individual statements from those documents that describe some discourse relation to the query. We propose several ranking scores that consider the discourse structure of the documents to measure the relevance of a pair of EDUs to a query. Moreover, we combine those individual relevance scores using a random decision forest (RDF) model to create a single relevance score. Despite the numerous challenges of constructing a rich document representation using the discourse relations in a document, our experimental results show that it improves the F-score in an information retrieval task. We publicly release our manually annotated test collection to expedite future research in discourse-based information retrieval.
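The idea of a discourse tree over EDUs can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `Node` class, the relation labels, the toy tree, and the term-overlap score are hypothetical stand-ins, not the ranking scores or the random decision forest described in the abstract. It only shows how a tree of EDUs joined by discourse relations lets a search return related statements alongside the relation that links them.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional, Tuple


@dataclass
class Node:
    """Toy discourse-tree node: a leaf holds an EDU, an internal node holds a relation."""
    relation: Optional[str] = None   # hypothetical label such as "cause"; None for leaves
    text: Optional[str] = None       # EDU text; None for internal nodes
    children: List["Node"] = field(default_factory=list)


def edus(node: Node) -> List[str]:
    """Collect the EDU texts under a node, left to right."""
    if node.text is not None:
        return [node.text]
    return [t for child in node.children for t in edus(child)]


def related_pairs(node: Node) -> Iterator[Tuple[str, str, str]]:
    """Yield (relation, left_span, right_span) for every binary internal node."""
    if node.text is not None:
        return
    if len(node.children) == 2:
        left, right = node.children
        yield node.relation, " ".join(edus(left)), " ".join(edus(right))
    for child in node.children:
        yield from related_pairs(child)


def overlap(query: str, span: str) -> float:
    """Fraction of query terms found in the span (an assumed stand-in score)."""
    q = set(query.lower().split())
    s = set(span.lower().split())
    return len(q & s) / len(q) if q else 0.0


def search(tree: Node, query: str) -> List[Tuple[float, str, str, str]]:
    """Rank related span pairs by the better overlap of either side with the query."""
    hits = []
    for relation, left, right in related_pairs(tree):
        score = max(overlap(query, left), overlap(query, right))
        if score > 0:
            hits.append((score, relation, left, right))
    return sorted(hits, key=lambda h: h[0], reverse=True)


# A hand-built tree; a real system would obtain it from an RST parser.
tree = Node(relation="cause", children=[
    Node(text="heavy rain fell overnight"),
    Node(relation="elaboration", children=[
        Node(text="the river flooded"),
        Node(text="several roads were closed"),
    ]),
])

results = search(tree, "river flooded")
for score, relation, left, right in results:
    print(f"{score:.2f} {relation}: {left} | {right}")
```

A bag-of-words engine would stop at matching "river flooded"; here each hit also carries the statement on the other side of the relation, which is the kind of result the abstract describes.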


Bibliographic details

Source: European Conference on Information Retrieval Research, 2015, 12 pages
Authors: Pascal Kuyten; Danushka Bollegala; Bernd Hollerit; Helmut Prendinger; Kiyoharu Aizawa
Original format: PDF
Chinese Library Classification: G354.4-53

Similar literature

1. Sentiment analysis based on rhetorical structure theory: Learning deep neural networks from discourse trees [J]. Kraus Mathias, Feuerriegel Stefan. Expert Systems with Applications. 2019, issue MARa
2. A qualitative comparison method for rhetorical structures: identifying different discourse structures in multilingual corpora [J]. Iruskieta Mikel, da Cunha Iria, Taboada Maite. Language Resources and Evaluation. 2015, issue 2
3. A Review of: Classical Greek Rhetorical Theory and the Disciplining of Discourse, by David Timmerman and Edward Schiappa. New York: Cambridge University Press, 2010. ix + 192 pp. [J]. Reviewed by Brandon Inabineta. Rhetoric Society Quarterly. 2011, issue 4
4. A Discourse Search Engine Based on Rhetorical Structure Theory [C]. Pascal Kuyten, Danushka Bollegala, Bernd Hollerit, European Conference on Information Retrieval Research. 2015
5. Lexis in chemical engineering discourse: Analyzing style in chemical engineering research articles through a rhetorical lens [D]. Young, David Lamar, Jr. 2013
6. Measuring discourse coherence in anomic aphasia using Rhetorical Structure Theory [O]. Anthony Pak-Hin Kong, Anastasia Linnik, Sam-Po Law
7. A discourse search engine based on rhetorical structure theory [O]. Kuyten P, Bollegala D, Hollerit B, 2015
