Conference proceedings: Advances in Information Retrieval

Evaluating Text Representations for Retrieval of the Best Group of Documents

Abstract

Cluster retrieval assumes that the probability of relevance of a document should depend on the relevance of other similar documents to the same query. The goal is to find the best group of documents. Many studies have examined the effectiveness of this approach by employing different retrieval methods or clustering algorithms, but few have investigated text representations. This paper revisits the problem of retrieving the best group of documents from the language-modeling perspective. We analyze the advantages and disadvantages of a range of representation techniques, derive features that characterize good document groups, and experiment with a new probabilistic representation as a first step toward incorporating these features. Empirical evaluation demonstrates that the relationships between documents can be leveraged in retrieval when a good representation technique is available, and that retrieving the best group of documents can be more effective than retrieving individual documents.
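
To make the idea of scoring a group of documents concrete, the following is a minimal sketch of one simple language-modeling approach: each group is represented by concatenating its documents, and groups are ranked by Dirichlet-smoothed query likelihood. This is an illustrative assumption, not the probabilistic representation proposed in the paper. The function names, the toy data, and the smoothing parameter `mu` are all hypothetical.

```python
import math
from collections import Counter


def query_log_likelihood(query_terms, text_terms, coll_counts, coll_len, mu=1000.0):
    """Dirichlet-smoothed log P(query | text) for a tokenized text."""
    counts = Counter(text_terms)
    length = len(text_terms)
    score = 0.0
    for term in query_terms:
        p_coll = coll_counts.get(term, 0) / coll_len
        if p_coll == 0.0:
            continue  # term never occurs in the collection: ignore it
        score += math.log((counts[term] + mu * p_coll) / (length + mu))
    return score


def rank_groups(query_terms, groups, mu=1000.0):
    """Rank groups (name -> list of tokenized documents), best first.

    Assumption: a group is represented by concatenating its documents.
    """
    all_terms = [t for docs in groups.values() for doc in docs for t in doc]
    coll_counts = Counter(all_terms)
    coll_len = len(all_terms)
    scores = {}
    for name, docs in groups.items():
        concatenated = [t for doc in docs for t in doc]
        scores[name] = query_log_likelihood(
            query_terms, concatenated, coll_counts, coll_len, mu
        )
    return sorted(scores, key=scores.get, reverse=True)


# Toy data, invented purely for illustration.
groups = {
    "retrieval": [
        ["cluster", "retrieval", "language", "model"],
        ["document", "retrieval", "ranking"],
    ],
    "cooking": [
        ["pasta", "recipe", "sauce"],
        ["bread", "recipe", "oven"],
    ],
}
query = ["document", "retrieval"]
ranking = rank_groups(query, groups, mu=10.0)
print(ranking)
```

On this toy data the "retrieval" group is ranked first, because both query terms occur in it. Concatenation is only one possible group representation; the abstract's point is that the choice of representation affects how well the relationships between documents can be exploited.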
