Passage-Based Document Retrieval as a Tool for Text Mining with User's Information Needs


Abstract

Document retrieval can be considered a basic but important tool for text mining, one that is capable of taking a user's information need into account. However, document retrieval is a hard task when lengthy, multi-topic documents have to be retrieved from a very short description (a few keywords) of the information need. In this paper, we focus on this problem, which is typical of real-world applications. We experimentally validate that passage-based document retrieval is advantageous in such circumstances compared with conventional document retrieval. Passage-based document retrieval is a kind of document retrieval that considers only small fractions (passages) of a document when judging the document's relevance to the information need. As the passage-based method, we employ a method based on density distributions of keywords. It is compared with three conventional methods for document retrieval: the vector space model, pseudo-feedback, and latent semantic indexing. Experimental results show that the passage-based method is superior to the conventional methods when long documents have to be retrieved by short queries.
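
The abstract names a method based on density distributions of keywords but gives no formula. The following is a minimal sketch of how such a score might be computed, not the authors' implementation. It assumes that each query-keyword occurrence is smoothed over neighbouring token positions with a Hann window, and that a document is scored by its peak density. The window shape, the `width` parameter, the keyword weights, and the function names `hann`, `density_scores`, and `passage_score` are all illustrative assumptions.

```python
import math
from typing import Dict, List, Sequence


def hann(offset: int, width: int) -> float:
    """Hann window weight for a position `offset` tokens from the window centre."""
    if abs(offset) > width / 2:
        return 0.0
    return 0.5 * (1.0 + math.cos(2.0 * math.pi * offset / width))


def density_scores(
    tokens: Sequence[str], query_weights: Dict[str, float], width: int = 20
) -> List[float]:
    """Keyword density at every token position (assumed formulation)."""
    half = width // 2
    density = [0.0] * len(tokens)
    for pos, token in enumerate(tokens):
        weight = query_weights.get(token)
        if weight is None:
            continue  # not a query keyword
        # Spread this occurrence over the positions covered by the window.
        lo = max(0, pos - half)
        hi = min(len(tokens), pos + half + 1)
        for i in range(lo, hi):
            density[i] += weight * hann(i - pos, width)
    return density


def passage_score(
    tokens: Sequence[str], query_weights: Dict[str, float], width: int = 20
) -> float:
    """Score a document by its densest passage (peak of the density curve)."""
    return max(density_scores(tokens, query_weights, width), default=0.0)


# Hypothetical two-keyword query with equal weights.
query = {"passage": 1.0, "retrieval": 1.0}
clustered = ["filler"] * 50 + ["passage", "retrieval"] + ["filler"] * 50
scattered = ["passage"] + ["filler"] * 100 + ["retrieval"]
print(passage_score(clustered, query), passage_score(scattered, query))
```

Under these assumptions, a long document in which the keywords occur close together scores higher than one of similar length in which the same keywords are far apart, which is the behaviour the abstract attributes to passage-based retrieval for short queries.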


Bibliographic details

Source: International Conference on Discovery Science, 1998, 15 pages
Original format: PDF
Chinese Library Classification: Automation system theory

Similar literature


1. Tret: A Text Retrieval Efficiency Testing Tool For Different Document Types/Formats And Calculating Evaluation Measures For Xml Retrieval [J]. Guozhen Cheng. Advances in Computational Sciences and Technology. 2018, No. 3
2. Utilizing passage-based language models for ad hoc document retrieval [J]. Michael Bendersky, Oren Kurland. Information Retrieval. 2010, No. 2
3. Passage-Based Text Summarization for Legal Information Retrieval [J]. Kanapala Ambedkar, Jannu Srikanth, Pamula Rajendra. Arabian Journal for Science and Engineering. 2019, No. 11
4. Passage-Based Document Retrieval as a Tool for Text Mining with User's Information Needs [C]. Koichi Kise, Markus Junker, Andreas Dengel. Discovery Science. 2001
5. Text association mining with cross-sentence inference, structure-based document model and multi-relational text mining [D]. Thaicharoen, Supphachai. 2009
6. PubstractHelper: A Web-based Text-Mining Tool for Marking Sentences in Abstracts from PubMed Using Multiple User-Defined Keywords [O]. Chou-Cheng Chen, Chung-Liang Ho. 2014
7. Utilizing Passage-Based Language Models for Document Retrieval [O]. Michael Bendersky, Oren Kurland. 2008
