A Study of Document-Context Models in Information Retrieval.

Abstract

In this thesis, we study new retrieval models that simulate "local" relevance decision-making at every term location in a document; these local relevance decisions are then combined into the "document-wide" relevance decision for the document. The local relevance decision for a term t occurring at the k-th location in a document is made by considering its document-context, which is the window of terms centred at the term t at the k-th location. Therefore, different relevance scores (preferences) are obtained for the same term t at different locations in a document, depending on its document-contexts. This differs from traditional models, in which a term t receives the same score regardless of its location in a document.

A hybrid document-context model is studied, which combines various existing effective models and techniques. It estimates the relevance decision preferences of document-contexts as log-odds and combines the estimated preferences using different types of aggregation operators that comply with the relevance decision principles. The model is evaluated using retrospective experiments to reveal its potential. Besides retrospective experiments, we also use the top 20 documents from the initial ranked list to perform relevance feedback experiments with a probabilistic document-context model, and the results are promising.

We also show that when the size of the document-contexts is shrunk to unity, the document-context model simplifies to a basic ranking formula that directly corresponds to TF-IDF term weights. Thus, TF-IDF term weights can be interpreted as making relevance decisions. This helps to establish a unifying perspective on information retrieval as relevance decision-making and to develop advanced TF-IDF-related term weights for future elaborate retrieval models.

Lastly, we develop a new relevance feedback algorithm by splitting the ranked document list into multiple lists of document-contexts. The judgement of relevance of the documents is not done sequentially. This is called active feedback, and we show that our new relevance feedback algorithm obtains better results than the conventional relevance feedback algorithm, and does so more reliably than a maximal marginal relevance (MMR) method that does not use document-contexts.
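
The idea of local relevance decisions over document-contexts can be illustrated with a minimal Python sketch. It is not the thesis's actual model: the function names, the `half_width` parameter, the toy `weights` table, and the choice to score a context by summing per-term weights are all assumptions made for illustration. The sketch extracts the window of terms centred at each query-term location, assigns each window a local score, and combines the local scores with an aggregation operator.

```python
from typing import Callable, Dict, List, Sequence, Set


def extract_contexts(doc: Sequence[str], query: Set[str], half_width: int) -> List[List[str]]:
    """Return the window of terms centred at every query-term location in doc.

    half_width = 0 yields contexts of size one (the query term alone).
    """
    contexts = []
    for k, term in enumerate(doc):
        if term in query:
            lo = max(0, k - half_width)
            hi = min(len(doc), k + half_width + 1)
            contexts.append(list(doc[lo:hi]))
    return contexts


def local_score(context: Sequence[str], weights: Dict[str, float]) -> float:
    """Hypothetical local preference: sum of per-term weights inside the window.

    The weights stand in for log-odds style term evidence; they are made up here.
    """
    return sum(weights.get(t, 0.0) for t in context)


def score_document(
    doc: Sequence[str],
    query: Set[str],
    weights: Dict[str, float],
    half_width: int,
    aggregate: Callable[[List[float]], float] = sum,
) -> float:
    """Combine local scores into one document-wide score with an aggregation operator."""
    scores = [local_score(c, weights) for c in extract_contexts(doc, query, half_width)]
    return aggregate(scores) if scores else 0.0


doc = "the cat sat on the mat near the cat".split()
weights = {"cat": 2.0, "mat": 1.5}  # assumed toy weights

# Context size one: every occurrence scores the same, so the sum is tf * weight.
print(score_document(doc, {"cat"}, weights, half_width=0))  # 4.0

# Wider contexts: the same term scores differently at different locations.
print(score_document(doc, {"cat"}, weights, half_width=3))                 # 5.5
print(score_document(doc, {"cat"}, weights, half_width=3, aggregate=max))  # 3.5
```

With `half_width=0` each context holds only the query term, so summing the local scores gives the term frequency times the term's weight. Under the assumption that the weight plays the role of an IDF-like quantity, this mirrors the abstract's remark that contexts of size one lead to a TF-IDF-style ranking formula.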

Bibliographic record

Author: Wu, Ho Chung
Author affiliation: Hong Kong Polytechnic University (Hong Kong)
Degree-granting institution: Hong Kong Polytechnic University (Hong Kong)
Subject: Computer Science
Degree: Ph.D.
Year: 2011
Pages: 166
Format: PDF
Language: English
