Mining document, concept, and term associations for effective biomedical retrieval: introducing MeSH-enhanced retrieval models

Mao Jin; Lu Kun; Mu Xiangming; Li Gang

Abstract

Manually assigned subject terms, such as Medical Subject Headings (MeSH) in the health domain, describe the concepts or topics of a document. Existing information retrieval models do not take full advantage of such information. In this paper, we propose two MeSH-enhanced (ME) retrieval models that integrate the concept layer (i.e. MeSH) into the language modeling framework to improve retrieval performance. The new models quantify associations between documents and their assigned concepts to construct conceptual representations for the documents, and mine associations between concepts and terms to construct generative concept models. The two ME models reconstruct two essential estimation processes of the relevance model (Lavrenko and Croft 2001) by incorporating the document-concept and the concept-term associations. More specifically, in Model 1, language models of the pseudo-feedback documents are enriched by their assigned concepts. In Model 2, concepts that are related to users' queries are first identified, and then used to reweight the pseudo-feedback documents according to the document-concept associations. Experiments carried out on two standard test collections show that the ME models outperformed the query likelihood model, the relevance model (RM3), and an earlier ME model. A detailed case analysis provides insight into how and why the new models improve/worsen retrieval performance. Implications and limitations of the study are discussed. This study provides new ways to formally incorporate semantic annotations, such as subject terms, into retrieval models. The findings of this study suggest that integrating the concept layer into retrieval models can further improve the performance over the current state-of-the-art models.
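
The two estimation steps the abstract describes can be illustrated with a small sketch. This is not the paper's formulation: the abstract gives no equations, so the linear mixing in `enrich_with_concepts`, the product-form concept score in `concept_reweight`, the parameters `beta` and `lam`, and all toy documents, concepts, and probabilities below are assumptions chosen only to show the idea. The relevance model is shown in its basic RM1 form; the RM3 interpolation with the original query is omitted.

```python
from collections import Counter

def doc_lm(tokens):
    """Maximum-likelihood unigram model P(w|D) for one document."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def query_likelihood(query, lm, background, lam=0.5):
    """P(Q|model) with Jelinek-Mercer smoothing (lam is an assumed value)."""
    score = 1.0
    for w in query:
        score *= (1 - lam) * lm.get(w, 0.0) + lam * background.get(w, 0.0)
    return score

def enrich_with_concepts(lm, concepts, concept_lms, beta=0.3):
    """Model-1-style sketch (assumed form): mix P(w|D) with sum_c P(w|c) P(c|D)."""
    vocab = set(lm)
    for c in concepts:
        vocab |= set(concept_lms[c])
    enriched = {}
    for w in vocab:
        concept_part = sum(p_c * concept_lms[c].get(w, 0.0)
                           for c, p_c in concepts.items())
        enriched[w] = (1 - beta) * lm.get(w, 0.0) + beta * concept_part
    return enriched

def concept_reweight(query, doc_concepts, concept_lms, background):
    """Model-2-style sketch (assumed form): weight(D) = sum_c P(Q|c) P(c|D)."""
    q_given_c = {c: query_likelihood(query, lm, background)
                 for c, lm in concept_lms.items()}
    return [sum(p_c * q_given_c[c] for c, p_c in concepts.items())
            for concepts in doc_concepts]

def relevance_model(query, feedback_lms, background, doc_weights=None):
    """RM1-style estimate: P(w|R) proportional to sum_D P(w|D) * weight(D)."""
    if doc_weights is None:
        # Default weighting: the query likelihood of each feedback document.
        doc_weights = [query_likelihood(query, lm, background)
                       for lm in feedback_lms]
    scores = Counter()
    for lm, weight in zip(feedback_lms, doc_weights):
        for w, p in lm.items():
            scores[w] += p * weight
    norm = sum(scores.values())
    return {w: s / norm for w, s in scores.items()}

# Toy pseudo-feedback documents with hypothetical concept assignments P(c|D).
docs = [
    "aspirin reduces heart attack risk".split(),
    "statin therapy lowers cholesterol".split(),
]
doc_concepts = [
    {"Aspirin": 0.6, "Myocardial Infarction": 0.4},
    {"Cholesterol": 1.0},
]
# Hypothetical generative concept models P(w|c).
concept_lms = {
    "Aspirin": {"aspirin": 0.7, "salicylate": 0.3},
    "Myocardial Infarction": {"heart": 0.4, "attack": 0.3, "infarction": 0.3},
    "Cholesterol": {"cholesterol": 0.8, "lipid": 0.2},
}

query = ["heart", "attack"]
background = doc_lm([w for d in docs for w in d])

# Model-1-style: enrich each feedback document model, then estimate P(w|R).
enriched = [enrich_with_concepts(doc_lm(d), c, concept_lms)
            for d, c in zip(docs, doc_concepts)]
rm_model1 = relevance_model(query, enriched, background)

# Model-2-style: keep the document models, but reweight documents by concepts.
weights = concept_reweight(query, doc_concepts, concept_lms, background)
rm_model2 = relevance_model(query, [doc_lm(d) for d in docs], background, weights)
```

In this toy setup the enriched model assigns probability to "infarction", a term that never occurs in the first document's text, and the concept-based weights favour the first document because its assigned concepts generate the query terms.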

Bibliographic details

Source
Information Retrieval, 2015, Issue 5, pp. 413-444 (32 pages)
Authors
Mao Jin; Lu Kun; Mu Xiangming; Li Gang
Author affiliations

Wuhan Univ, Ctr Studies Informat Resources, Wuhan 430072, Hubei, Peoples R China;

Univ Oklahoma, Sch Lib & Informat Studies, Norman, OK 73019 USA;

Univ Wisconsin, Sch Informat Studies, Milwaukee, WI 53211 USA;

Wuhan Univ, Ctr Studies Informat Resources, Wuhan 430072, Hubei, Peoples R China

Original format: PDF
Language: English
Keywords
Relevance model; Concept; MeSH-enhanced retrieval models; Health information retrieval
