In a cloud data center, it is common for a storage system to be shared by front-end, user-interacting applications and back-end, data-intensive applications running on different virtual machines (VMs). Meeting the latency requirements of the I/O streams generated by the VMs that run front-end applications is necessary, but it is difficult for two reasons: (1) their latency requirements are often specified at percentiles, and (2) some of these streams issue requests in bursts. This paper proposes ~2TL, a scheduling algorithm designed to meet the latency requirements of these applications. To meet latency requirements at user-specified percentiles, ~2TL continuously controls the number of requests that expire before being serviced. To handle request bursts, it proactively adjusts scheduling parameters to avoid violations of latency requirements. We evaluated ~2TL on a simulated RAID storage system using workloads that consist of concurrent I/O streams covering a wide range of access characteristics, including burstiness. In this evaluation, latency requirements were specified at various percentiles found in the literature. When the storage system was sufficiently provisioned, ~2TL met the latency requirements of each workload without degrading storage system performance.
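To make the percentile-based control idea concrete, the following is a minimal sketch (not the paper's implementation; all class and method names are hypothetical) of how a scheduler might track the fraction of requests that expire before being serviced and react before a percentile requirement, such as p95, is violated:

```python
from collections import deque


class PercentileExpiryController:
    """Hypothetical sketch of percentile-based expiry control: a latency
    requirement at the p-th percentile tolerates a (1 - p) fraction of
    requests missing their latency target, so the controller tracks the
    observed miss fraction over a sliding window of recent requests."""

    def __init__(self, percentile, window=1000):
        self.allowed_miss = 1.0 - percentile   # e.g. 0.05 for a p95 requirement
        self.window = deque(maxlen=window)     # 1 = expired, 0 = met target

    def record(self, latency, target):
        """Record one completed request: expired if it exceeded its target."""
        self.window.append(1 if latency > target else 0)

    def miss_fraction(self):
        """Fraction of recent requests that expired before being serviced."""
        return sum(self.window) / len(self.window) if self.window else 0.0

    def should_throttle_background(self):
        """Proactively tighten scheduling (e.g. deprioritize back-end
        streams) when the miss fraction approaches the allowed slack,
        rather than waiting for the requirement to be violated outright.
        The 0.5 safety margin here is an illustrative choice."""
        return self.miss_fraction() > 0.5 * self.allowed_miss
```

A scheduler could consult `should_throttle_background()` on each dispatch decision; for example, with a p95 requirement, sustained misses above 2.5% of recent requests would trigger corrective action well before the 5% budget is exhausted.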