Concept-Based Information Retrieval Using Explicit Semantic Analysis

OFER EGOZI; SHAUL MARKOVITCH; EVGENIY GABRILOVICH

Abstract

Information retrieval systems traditionally rely on textual keywords to index and retrieve documents. Keyword-based retrieval may return inaccurate and incomplete results when different keywords are used to describe the same concept in the documents and in the queries. Furthermore, the relationship between these related keywords may be semantic rather than syntactic, and capturing it thus requires access to comprehensive human world knowledge. Concept-based retrieval methods have attempted to tackle these difficulties by using manually built thesauri, by relying on term cooccurrence data, or by extracting latent word relationships and concepts from a corpus. In this article we introduce a new concept-based retrieval approach based on Explicit Semantic Analysis (ESA), a recently proposed method that augments keyword-based text representation with concept-based features, automatically extracted from massive human knowledge repositories such as Wikipedia. Our approach generates new text features automatically, and we have found that high-quality feature selection becomes crucial in this setting to make the retrieval more focused. However, due to the lack of labeled data, traditional feature selection methods cannot be used, hence we propose new methods that use self-generated labeled training data. The resulting system is evaluated on several TREC datasets, showing superior performance over previous state-of-the-art results.
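
The concept-based representation described in the abstract can be illustrated with a toy sketch. This is not the authors' system: the three "concept articles" are invented stand-ins for Wikipedia, the TF-IDF weighting and the names `CONCEPTS`, `build_inverted_index`, `concept_vector`, and `cosine` are assumptions made for this example, and the feature selection step proposed in the article is omitted. It only shows how a query and a document that share no keywords can still match once both are mapped to weighted concept vectors.

```python
import math
from collections import Counter

# Hypothetical miniature knowledge repository: concept name -> article text.
# Invented for illustration; real ESA uses Wikipedia articles as concepts.
CONCEPTS = {
    "Automobile": "car engine wheel road driver vehicle car",
    "Aviation": "aircraft pilot wing airport flight engine",
    "Banking": "bank money loan interest account deposit",
}


def tokenize(text):
    """Lowercase and split on whitespace (deliberately simplistic)."""
    return text.lower().split()


def build_inverted_index(concepts):
    """Map each word to {concept: TF-IDF weight of that word in the concept's article}."""
    n = len(concepts)
    counts = {name: Counter(tokenize(body)) for name, body in concepts.items()}
    # Number of concept articles in which each word appears.
    doc_freq = Counter(word for c in counts.values() for word in c)
    index = {}
    for name, c in counts.items():
        for word, tf in c.items():
            # +1 keeps a positive weight for words shared by every article.
            idf = math.log(n / doc_freq[word]) + 1.0
            index.setdefault(word, {})[name] = tf * idf
    return index


def concept_vector(text, index):
    """Represent a text as the sum of the concept vectors of its words."""
    vec = Counter()
    for word in tokenize(text):
        for concept, weight in index.get(word, {}).items():
            vec[concept] += weight
    return dict(vec)


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    norm = math.sqrt(sum(w * w for w in u.values())) * math.sqrt(
        sum(w * w for w in v.values())
    )
    return dot / norm if norm else 0.0


INDEX = build_inverted_index(CONCEPTS)

query = concept_vector("vehicle driver", INDEX)
doc_a = concept_vector("car on the road", INDEX)  # no keyword in common with the query
doc_b = concept_vector("loan interest", INDEX)

# doc_a matches the query in concept space; doc_b does not.
print(cosine(query, doc_a), cosine(query, doc_b))
```

In this toy setting, keyword matching would score both documents zero against the query, while the concept vectors place the query and `doc_a` on the same concept. With a real repository each text activates many concepts, which is why the abstract stresses selecting the most useful ones.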

Bibliographic details

Source
ACM Transactions on Information Systems | 2011, Issue 2 | pp. 43-76 | 34 pages
Authors
OFER EGOZI; SHAUL MARKOVITCH; EVGENIY GABRILOVICH
Author affiliations

Department of Computer Science, The Technion, Haifa, 32000, Israel;

Department of Computer Science, The Technion, Haifa, 32000, Israel;

Yahoo! Research, 4301 Great America Parkway, Santa Clara, CA 95054

Indexed in: Science Citation Index (SCI, USA); Engineering Index (EI, USA)
Original format: PDF
Language: English
Keywords
concept-based retrieval; explicit semantic analysis; feature selection; semantic search
