With the increasing interest in public cloud infrastructures, a crucial need has emerged for trustworthy remote data storage and processing. At the same time, the increased demand for storage, from backup services to cloud infrastructures, has driven the use of deduplication to eliminate redundant data in the cloud, improving storage efficiency, saving bandwidth, and reducing the cost of deploying and maintaining cloud infrastructures. It is commonly assumed that it is in the best interest of both the cloud provider and the customer to perform cross-silo deduplication, i.e., deduplicating across user silos within the cloud. In this paper we challenge this assumption, providing experimental data that examines the benefits of cross-silo deduplication on real data for varying silo sizes and degrees of data similarity. We also present an in-depth analysis of the issues inherent to cross-silo deduplication, detailing the attack vectors it enables through inadvertent data leakage, including a novel pair of attacks overlooked by previous work. We then discuss solutions presented in the literature and the problems inherent in these solutions, and challenge the notion that cross-silo deduplication is worth the cost in lost security.