In this paper, we explore the performance of gang scheduling on a cluster using the Quadrics interconnection network. In such a cluster, the scheduler can take advantage of this network's unique capabilities, including a network interface card-based processor and memory and efficient user-level communication libraries. We developed a micro-benchmark to test the scheduler's performance under various aspects of parallel job workloads: memory usage, bandwidth- and latency-bound communication, number of processes, timeslice quantum, and multiprogramming levels. Our experiments show that the gang scheduler performs relatively well under most workload conditions, is largely insensitive to the number of concurrent jobs in the system, and scales almost linearly with the number of nodes. On the other hand, the scheduler is very sensitive to the timeslice quantum, and values under 30 seconds can incur large overheads and fairness problems.