Query-Based Inter-document Similarity Using Probabilistic Co-relevance Model

Abstract

Inter-document similarity is the critical information that determines whether cluster-based retrieval improves on the baseline. However, the theory of inter-document similarity has not been investigated, even though such work could provide a principle for defining a better similarity in a well-motivated direction. To supply such a theory, this paper starts by pursuing an ideal inter-document similarity that optimally satisfies the cluster hypothesis. We propose a probabilistic principle of inter-document similarity: the optimal similarity of two documents should be proportional to the probability that they are co-relevant to an arbitrary query. Based on this principle, the study of inter-document similarity is formulated as the problem of estimating the co-relevance model of documents. Furthermore, we find that the optimal inter-document similarity should be defined using queries, not terms, as its basic unit, that is, as a query-based similarity. We rigorously derive a novel query-based similarity from the co-relevance model, without any heuristics. Experimental results show that the new query-based inter-document similarity significantly improves on the previously used term-based similarity under Voorhees' evaluation measure.
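
The co-relevance principle stated in the abstract can be illustrated with a small sketch. This is not the paper's derivation or estimator: it assumes a finite set of sample queries with a known prior, assumes the per-query relevance probabilities are already given, and assumes the relevance of the two documents is independent given the query. The function name `co_relevance_similarity` and all numbers are hypothetical.

```python
from typing import Sequence


def co_relevance_similarity(
    rel_a: Sequence[float],
    rel_b: Sequence[float],
    query_prior: Sequence[float],
) -> float:
    """Estimate the probability that two documents are both relevant
    to a query drawn at random from a finite query sample.

    rel_a[i], rel_b[i]: assumed P(relevant | document, query i).
    query_prior[i]: assumed P(query i); the values must sum to 1.
    Assumption: relevance of the two documents is independent
    given the query, so the joint probability factorizes.
    """
    if not (len(rel_a) == len(rel_b) == len(query_prior)):
        raise ValueError("all inputs must have one entry per query")
    if abs(sum(query_prior) - 1.0) > 1e-9:
        raise ValueError("query_prior must sum to 1")
    # Sum over queries of P(q) * P(rel | a, q) * P(rel | b, q).
    return sum(p * ra * rb for p, ra, rb in zip(query_prior, rel_a, rel_b))


# Toy data: three hypothetical queries with a uniform prior.
prior = [1 / 3, 1 / 3, 1 / 3]
doc_x = [0.9, 0.1, 0.0]  # mostly relevant to query 0
doc_y = [0.8, 0.2, 0.0]  # mostly relevant to query 0
doc_z = [0.0, 0.1, 0.9]  # mostly relevant to query 2

sim_xy = co_relevance_similarity(doc_x, doc_y, prior)
sim_xz = co_relevance_similarity(doc_x, doc_z, prior)
print(sim_xy > sim_xz)
```

In this toy setting the two documents that tend to be relevant to the same query score higher than the pair relevant to different queries, which is the behaviour the cluster hypothesis asks of a similarity. The unit of comparison is the query rather than the shared term.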

Bibliographic details

Source: Advances in Information Retrieval | 2008 | pp. 684-688 | 5 pages
Conference location: Glasgow (GB)
Authors: Seung-Hoon Na; In-Su Kang; Jong-Hyeok Lee
Original format: PDF
Language: English
Chinese Library Classification: Information processing

Similar literature

1. Probabilistic co-relevance for query-sensitive similarity measurement in information retrieval [J]. Seung-Hoon Na. Information Processing & Management, 2013, no. 2.
2. Query-based biclustering of gene expression data using Probabilistic Relational Models [J]. Hui Zhao, Lore Cloots, Tim Van den Bulcke. BMC Bioinformatics, 2011, Supplement 1.
3. Musical Similarity and Commonness Estimation Based on Probabilistic Generative Models of Musical Elements [J]. Tomoyasu Nakano, Kazuyoshi Yoshii, Masataka Goto. International Journal of Semantic Computing, 2016, no. 1.
4. Query-Based Inter-document Similarity Using Probabilistic Co-relevance Model [C]. Seung-Hoon Na, In-Su Kang, Jong-Hyeok Lee. European Conference on IR Research, 2008.
5. Prediction of Protein Function with a Probabilistic Model for Analysis of Sequence Similarity Networks and Genomic Context [D]. Yunes, Jeffrey Michael. 2018.
6. Query-based biclustering of gene expression data using Probabilistic Relational Models [O]. Hui Zhao, Lore Cloots, Tim Van den Bulcke. 2011.
7. Query-based biclustering of gene expression data using Probabilistic Relational Models [O]. Zhao, Hui, Cloots, Lore, Van den Bulcke, Tim. 2011.
