Embodiments for workload management aggregate locality information for a set of files in a cluster of hosts, from the file level to the level of the set of files. To facilitate workload scheduling in the cluster, a subset of the set of files is selected. A set of storage size counters, each assigned to a host in the cluster, is reset. An overall storage size counter is reset, and the files in the subset are scanned. For each scanned file, locality information of the file is retrieved and added to the storage size counters of the hosts, and a total size of the file is added to the overall storage size counter. As output, the proportion of each host's storage size counter to the overall storage size counter is then computed.
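
The aggregation described above can be illustrated with a minimal Python sketch. It is not the claimed implementation. The names `FileInfo`, `bytes_by_host`, and `aggregate_locality` are hypothetical, and the sketch assumes that the locality information of a file is available as a mapping from host name to the number of bytes of that file stored on the host.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List


@dataclass
class FileInfo:
    """Hypothetical per-file record; not part of the original disclosure."""
    size: int  # total size of the file in bytes
    # Assumed form of locality information: host name -> bytes stored there.
    bytes_by_host: Dict[str, int] = field(default_factory=dict)


def aggregate_locality(files: Iterable[FileInfo], hosts: List[str]) -> Dict[str, float]:
    """Return, for each host, the share of the scanned data stored on it."""
    # Reset one storage size counter per host and the overall counter.
    host_counters = {host: 0 for host in hosts}
    overall = 0

    # Scan the selected subset of files.
    for f in files:
        # Add the file's locality information to the per-host counters.
        for host, nbytes in f.bytes_by_host.items():
            if host in host_counters:
                host_counters[host] += nbytes
        # Add the file's total size to the overall counter.
        overall += f.size

    # Avoid division by zero when the subset is empty or all files are empty.
    if overall == 0:
        return {host: 0.0 for host in hosts}

    # Proportion of each host counter relative to the overall counter.
    return {host: count / overall for host, count in host_counters.items()}


subset = [
    FileInfo(size=100, bytes_by_host={"a": 100, "b": 50}),
    FileInfo(size=300, bytes_by_host={"b": 300}),
]
proportions = aggregate_locality(subset, hosts=["a", "b", "c"])
```

In this sketch the proportions need not sum to 1, since a replicated file contributes its bytes to the counter of every host that holds a copy. A scheduler could, under these assumptions, prefer the hosts with the largest proportion when placing work that reads the subset.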