Journal: Expert Systems with Applications

Microblog semantic context retrieval system based on linked open data and graph-based theory



Abstract

Microblogging platforms have emerged as large collections of short documents. Retrieving short text effectively is a significant research challenge owing to several factors: creative language usage, high contextualization, the informal nature of microblog posts, and the limited length of this form of communication. As a result, microblog retrieval systems suffer from data sparseness and the semantic gap. They often fail to meet users' information needs accurately, because tweets are composed of few terms and frequently do not contain the query terms, so many relevant tweets are not retrieved. To overcome data sparseness and the semantic gap, recent studies on content-based microblog search have focused on adding semantics to microposts by linking short text to knowledge base resources. Previous studies use a bag-of-concepts representation, linking named entities to their corresponding knowledge base concepts. However, a bag-of-concepts representation considers only the concepts that match named entities and assumes that all concepts are equivalent and independent. In this paper, we therefore present a graph-of-concepts method that considers the relationships among the concepts that match named entities in short text and their related concepts. The method contextualizes each concept in the graph by leveraging the linked nature of DBpedia, a Linked Open Data knowledge base, together with graph-based centrality theory. Furthermore, we propose a similarity measure that computes the similarity between two graphs (query and tweet) by considering the relationships between the contextualized concepts. Finally, we report experimental results on a real Twitter dataset to demonstrate the effectiveness of our system. (c) 2016 Elsevier Ltd. All rights reserved.
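The abstract does not give the formulas behind the graph-of-concepts representation or the similarity measure, so the following is only a minimal illustrative sketch, not the authors' method. It assumes that each text has already been turned into a small undirected concept graph, weights each concept by plain degree centrality, and compares two graphs with a weighted Jaccard overlap of those weights. The function names, the choice of degree centrality, and the weighted Jaccard measure are all assumptions made for illustration; the DBpedia-style concept names are hand-written examples, not data retrieved from DBpedia.

```python
from collections import defaultdict


def build_graph(edges):
    """Build an undirected concept graph as an adjacency map: concept -> set of neighbours."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return dict(graph)


def degree_centrality(graph):
    """Degree centrality: the fraction of the other concepts each concept is linked to."""
    n = len(graph)
    if n <= 1:
        return {node: 0.0 for node in graph}
    return {node: len(neighbours) / (n - 1) for node, neighbours in graph.items()}


def graph_similarity(g1, g2):
    """Weighted Jaccard overlap of centrality-weighted concepts (an assumed, illustrative measure)."""
    c1, c2 = degree_centrality(g1), degree_centrality(g2)
    concepts = set(c1) | set(c2)
    shared = sum(min(c1.get(c, 0.0), c2.get(c, 0.0)) for c in concepts)
    total = sum(max(c1.get(c, 0.0), c2.get(c, 0.0)) for c in concepts)
    return shared / total if total else 0.0


# Hypothetical query and tweet graphs with hand-written concept links.
query_graph = build_graph([
    ("dbr:Barack_Obama", "dbr:President_of_the_United_States"),
    ("dbr:Barack_Obama", "dbr:White_House"),
])
tweet_graph = build_graph([
    ("dbr:White_House", "dbr:President_of_the_United_States"),
    ("dbr:White_House", "dbr:Washington,_D.C."),
])

score = graph_similarity(query_graph, tweet_graph)
print(f"query-tweet similarity: {score:.3f}")
```

In this toy example the query and the tweet share no named entity as their central concept, yet they still receive a non-zero score because they overlap on related concepts. That is the kind of match a bag-of-concepts representation restricted to named entities would miss. A fuller sketch would also compare the edges themselves, which this one uses only indirectly through centrality.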
