Journal: Distributed and Parallel Databases

Performance analysis of data intensive cloud systems based on data management and replication: a survey



Abstract

As we delve deeper into the 'Digital Age', we witness explosive growth in the volume, velocity, and variety of the data available on the Internet. For example, in 2012 about 2.5 quintillion bytes of data were created daily, originating from a myriad of sources and applications including mobile devices, sensors, individual archives, social networks, the Internet of Things, enterprises, cameras, software logs, etc. Such a 'data explosion' has led to one of the most challenging research issues of the current Information and Communication Technology era: how to optimally manage (e.g., store, replicate, filter, and the like) such large amounts of data and identify new ways to analyze them for unlocking information. It is clear that such large data streams cannot be managed by setting up on-premises enterprise database systems, as this leads to a large up-front cost in buying and administering the hardware and software systems. Therefore, next-generation data management systems must be deployed on the cloud. The cloud computing paradigm provides scalable and elastic resources, such as data and services accessible over the Internet. Every Cloud Service Provider must assure that data is efficiently processed and distributed in a way that does not compromise end-users' Quality of Service (QoS) in terms of data availability, data search delay, data analysis delay, and the like. In this context, data replication is used in the cloud to improve the performance (e.g., read and write delay) of applications that access data. Through replication, a data-intensive application or system can achieve high availability, better fault tolerance, and data recovery. In this paper, we survey data management and replication approaches (from 2007 to 2011) developed by both industrial and research communities.
The focus of the survey is to discuss and characterize the existing approaches to data replication and management that tackle resource usage and QoS provisioning with different levels of efficiency. Moreover, the breakdown of both influential expressions (data replication and management) to provide different QoS attributes is discussed. Furthermore, the performance advantages and disadvantages of data replication and management approaches in cloud computing environments are analyzed. Open issues and future challenges related to data consistency, scalability, load balancing, processing, and placement are also reported.
