Big data analysis jobs on clouds have become increasingly popular in recent years. Picking the right configuration for an incoming job is critical but challenging, because the configuration space is very large and the relationship between allocated resources and job performance is not deterministic. In this paper, we propose SERAC3 to allocate resources smartly and economically for big data clusters in community clouds. SERAC3 is a system that automatically extracts representative workloads from incoming big data analysis jobs, decides on an optimal configuration for each job, and adjusts its assignment strategy in quasi-real time. With experiments on a community cloud built on OpenStack, we show that, on average, SERAC3 selects a configuration within 2.2% of the exact optimal one, while saving about 80.1% of the search cost compared to exhaustive search.