Efficient dictionary compression for processing RDF big data using Google BigQuery

Abstract

The Resource Description Framework (RDF) data model is used on the Web to express billions of structured statements on a wide range of topics, including government, publications, and life sciences. Consequently, processing and storing this data requires high-specification systems, in terms of both storage and computational capabilities. On the other hand, cloud-based big data services such as Google BigQuery can be used to store and query this data without any upfront investment. Google BigQuery pricing is based on the size of the data being stored or queried, but because RDF statements contain long Uniform Resource Identifiers (URIs), the cost of querying and storing RDF big data can increase rapidly. In this paper we present and evaluate a novel and efficient dictionary compression algorithm that is faster, generates small dictionaries that can fit in memory, and achieves a better compression rate than other large-scale RDF dictionary compression approaches. Consequently, our algorithm also reduces the BigQuery storage and query cost.
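
To illustrate the general idea behind dictionary compression of RDF data, the following is a minimal sketch of plain dictionary encoding, in which each distinct term is replaced by a small integer ID. It is not the algorithm presented in the paper, whose details are not given in this abstract; the function names `dictionary_encode` and `dictionary_decode` and the sample triples are assumptions made purely for illustration.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]
EncodedTriple = Tuple[int, ...]


def dictionary_encode(
    triples: List[Triple],
) -> Tuple[List[EncodedTriple], Dict[str, int]]:
    """Replace every distinct term with a small integer ID (illustrative only)."""
    dictionary: Dict[str, int] = {}
    encoded: List[EncodedTriple] = []
    for triple in triples:
        ids = []
        for term in triple:
            # Assign the next free ID the first time a term is seen.
            if term not in dictionary:
                dictionary[term] = len(dictionary)
            ids.append(dictionary[term])
        encoded.append(tuple(ids))
    return encoded, dictionary


def dictionary_decode(
    encoded: List[EncodedTriple], dictionary: Dict[str, int]
) -> List[Tuple[str, ...]]:
    """Invert the dictionary to recover the original terms."""
    reverse = {term_id: term for term, term_id in dictionary.items()}
    return [tuple(reverse[term_id] for term_id in row) for row in encoded]


# Hypothetical sample data: two statements sharing the same three URIs.
triples = [
    ("<http://example.org/alice>", "<http://xmlns.com/foaf/0.1/knows>", "<http://example.org/bob>"),
    ("<http://example.org/bob>", "<http://xmlns.com/foaf/0.1/knows>", "<http://example.org/alice>"),
]

encoded, dictionary = dictionary_encode(triples)
```

In this sketch each long URI string is stored once in the dictionary, and the statements themselves hold only integers, which is why the approach can reduce the number of bytes stored and scanned by a size-priced service.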
