Conference proceedings: Intelligent Information Processing and Web Mining; Advances in Soft Computing

Towards a Framework Design of a Retrieval Document System Based on Rhetorical Structure Theory and Cue Phrases


Abstract

The amount of information available on the Internet is growing at an incredible rate. However, the lack of efficient indexing is still a major barrier to effective information retrieval on the web. This paper presents the design of a technique for content-based indexing and retrieval of relevant documents from a large document collection such as the Internet. The technique aims to improve retrieval quality by capturing the semantics of the documents. It identifies thematic relationships between parts of a text using a linguistic theory called Rhetorical Structure Theory (RST), relying on cue phrases to determine the set of rhetorical relations. Once these structures are determined, they can be saved in a database. The collection can then be queried using not only keywords, as traditional information retrieval systems do, but also rhetorical relations. The indexing and retrieval technique described in this paper is still under development, and initial results on a small number of documents have been very successful.
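The pipeline outlined in the abstract (detect rhetorical relations from cue phrases, store them in a database, then query by keyword and relation together) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the `CUE_PHRASES` table, the relation labels, the sentence-level granularity (full RST works with nucleus and satellite spans), the SQLite schema, and the function names `find_relations`, `build_index`, and `search`.

```python
import re
import sqlite3

# Hypothetical cue-phrase table: illustrative only, not the paper's actual list.
CUE_PHRASES = {
    "because": "CAUSE",
    "however": "CONTRAST",
    "although": "CONCESSION",
    "for example": "EXAMPLE",
}


def find_relations(text):
    """Return (relation, cue, sentence) triples for each cue phrase found."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    found = []
    for sentence in sentences:
        lowered = sentence.lower()
        for cue, relation in CUE_PHRASES.items():
            # Whole-word match so a cue does not fire inside another word.
            if re.search(r"\b" + re.escape(cue) + r"\b", lowered):
                found.append((relation, cue, sentence))
    return found


def build_index(documents):
    """Store the detected relations in an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE relations (doc_id TEXT, relation TEXT, cue TEXT, sentence TEXT)"
    )
    for doc_id, text in documents.items():
        rows = [(doc_id, rel, cue, sent) for rel, cue, sent in find_relations(text)]
        conn.executemany("INSERT INTO relations VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return conn


def search(conn, keyword, relation):
    """Return ids of documents where the keyword occurs inside the given relation."""
    cursor = conn.execute(
        "SELECT DISTINCT doc_id FROM relations "
        "WHERE relation = ? AND lower(sentence) LIKE ? ORDER BY doc_id",
        (relation, "%" + keyword.lower() + "%"),
    )
    return [row[0] for row in cursor]


# Toy collection, invented for this example.
docs = {
    "d1": "Indexing is slow. Retrieval fails because the index is stale.",
    "d2": "The index is fresh. However, retrieval is still slow.",
}
conn = build_index(docs)
print(search(conn, "index", "CAUSE"))
```

In this toy collection both documents mention "index", but only `d1` does so inside a sentence marked as a cause, so combining the keyword with a relation narrows the result in a way a keyword-only query could not.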
