Conference: CIKM 10; ACM conference on information and knowledge management

Efficient Wikipedia-Based Semantic Interpreter by Exploiting Top-k Processing

Abstract

Proper representation of the meaning of texts is crucial to enhancing many data mining and information retrieval tasks, including clustering, computing semantic relatedness between texts, and searching. Representing texts in the concept space derived from Wikipedia has received growing attention recently because of Wikipedia's comprehensiveness and expertise. This concept-based representation can capture semantic relatedness between texts that cannot be deduced with the bag-of-words model. A key obstacle to using Wikipedia as a semantic interpreter, however, is that the sheer number of concepts derived from Wikipedia makes it hard to map texts into the concept space efficiently. In this paper, we develop an efficient algorithm that represents the meaning of a text by using the concepts that best match it. In particular, our approach first computes the approximate top-k concepts that are most relevant to the given text. We then leverage these concepts to represent the meaning of the given text. The experimental results show that the proposed technique provides significant gains in execution time over current solutions to the problem.
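To make the idea of mapping a text onto its best-matching concepts concrete, here is a minimal sketch in Python. It is not the paper's algorithm: the abstract describes an approximate top-k computation whose details are not given here, so this sketch swaps in a plain exact top-k over an inverted index. The function names, the toy concepts, and all weights are hypothetical and chosen only for illustration.

```python
import heapq
from collections import defaultdict


def build_inverted_index(concept_vectors):
    """Turn {concept: {term: weight}} into {term: [(concept, weight), ...]}.

    The concept vectors stand in for Wikipedia-derived concepts (assumption).
    """
    index = defaultdict(list)
    for concept, terms in concept_vectors.items():
        for term, weight in terms.items():
            index[term].append((concept, weight))
    return index


def top_k_concepts(text_terms, index, k):
    """Score concepts by dot product with the text and keep the k best.

    This is an exact top-k, used here in place of the approximate
    top-k processing mentioned in the abstract.
    """
    scores = defaultdict(float)
    for term, text_weight in text_terms.items():
        # Only concepts that share at least one term with the text are scored.
        for concept, concept_weight in index.get(term, ()):
            scores[concept] += text_weight * concept_weight
    return heapq.nlargest(k, scores.items(), key=lambda item: item[1])


# Toy data (hypothetical weights, not taken from Wikipedia).
concepts = {
    "Database": {"query": 0.9, "index": 0.7},
    "Search engine": {"query": 0.6, "ranking": 0.8},
    "Cooking": {"recipe": 1.0},
}
text = {"query": 1.0, "ranking": 0.5}

index = build_inverted_index(concepts)
result = top_k_concepts(text, index, k=2)
print(result)
```

In this toy run the text is represented by the two concepts "Search engine" and "Database", while "Cooking" never receives a score because it shares no terms with the text. Keeping only the k strongest concepts gives a short, sparse representation instead of a score for every concept.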
