Journal: IEEE Transactions on Knowledge and Data Engineering
A Scalable Data Chunk Similarity Based Compression Approach for Efficient Big Sensing Data Processing on Cloud

Abstract

Big sensing data is prevalent in both industrial and scientific research applications, where data is generated at high volume and velocity. Cloud computing provides a promising platform for processing and storing big sensing data because it offers a flexible stack of massive computing, storage, and software services in a scalable manner. Current big sensing data processing on the Cloud has adopted some data compression techniques. However, because of the high volume and velocity of big sensing data, traditional data compression techniques lack sufficient efficiency and scalability for data processing. Based on specific on-Cloud data compression requirements, we propose a novel scalable data compression approach based on calculating the similarity among partitioned data chunks. Instead of compressing basic data units, compression is conducted over partitioned data chunks. To restore the original data sets, restoration functions and predictions are designed. MapReduce is used for the algorithm implementation to achieve extra scalability on the Cloud. With experiments on real-world meteorological big sensing data on the U-Cloud platform, we demonstrate that the proposed scalable compression approach based on data chunk similarity can significantly improve data compression efficiency with an affordable loss of data accuracy.
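The chunk-level idea in the abstract can be illustrated with a minimal single-process sketch. This is not the authors' algorithm: the abstract does not specify the similarity measure, the chunk size, the threshold, or the restoration functions, so the names `partition`, `similarity`, `compress`, and `restore`, the RMS-based similarity score, and the rule of storing one representative per group of similar chunks are all assumptions made for illustration. The MapReduce implementation described in the abstract is not modeled here.

```python
from math import sqrt


def partition(values, chunk_size):
    """Split a sequence of readings into fixed-size chunks (the last may be shorter)."""
    return [values[i:i + chunk_size] for i in range(0, len(values), chunk_size)]


def similarity(a, b):
    """Hypothetical similarity score in (0, 1]: 1 / (1 + RMS difference).

    Chunks of different lengths are treated as dissimilar (score 0.0).
    """
    if len(a) != len(b):
        return 0.0
    rms = sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))
    return 1.0 / (1.0 + rms)


def compress(values, chunk_size, threshold):
    """Store one representative per group of similar chunks (assumed scheme).

    Returns (representatives, references), where references[i] is the index
    of the representative that stands in for the i-th input chunk.
    """
    representatives = []
    references = []
    for chunk in partition(values, chunk_size):
        for idx, rep in enumerate(representatives):
            if similarity(chunk, rep) >= threshold:
                references.append(idx)  # reuse an existing representative
                break
        else:
            representatives.append(chunk)  # no similar chunk seen so far
            references.append(len(representatives) - 1)
    return representatives, references


def restore(representatives, references):
    """Rebuild an approximation of the original sequence (lossy)."""
    restored = []
    for idx in references:
        restored.extend(representatives[idx])
    return restored


if __name__ == "__main__":
    # Made-up temperature-like readings, for illustration only.
    readings = [20.0, 20.1, 20.2, 20.0, 20.1, 20.3, 35.0, 35.2, 35.1]
    reps, refs = compress(readings, chunk_size=3, threshold=0.9)
    print(reps, refs)
    print(restore(reps, refs))
```

In this sketch a higher `threshold` keeps more representatives and loses less accuracy, while a lower one compresses more aggressively, which mirrors the trade-off between compression efficiency and data accuracy loss mentioned in the abstract.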
