The goal of this work is to characterize scientific data transfers and to determine the suitability of dynamic virtual circuit service for these transfers instead of the currently used IP-routed service. Specifically, logs collected by servers executing a commonly used scientific data transfer application, GridFTP, are obtained from three US super-computing/scientific research centers, NERSC, SLAC, and NCAR, and analyzed. Dynamic virtual circuit (VC) service, a relatively new offering from providers such as ESnet and Internet2, allows for the selection of a path on which a rate-guaranteed connection is established prior to data transfer. Given VC setup overhead, the first analysis of the GridFTP transfer logs characterizes the duration of sessions, where a session consists of multiple back-to-back transfers executed in batch mode between the same two GridFTP servers. Of the NCAR-NICS sessions analyzed, 56% of all sessions (90% of all transfers) would have been long enough to be served with dynamic VC service. An analysis of transfer logs across four paths, NCAR-NICS, SLAC-BNL, NERSC-ORNL and NERSC-ANL, shows significant throughput variance, where NICS, BNL, ORNL, and ANL are other US national laboratories. For example, on the NERSC-ORNL path, the inter-quartile range was 695 Mbps, with a maximum value of 3.64 Gbps and a minimum value of 758 Mbps. An analysis of the impact of various factors that are potential causes of this variance is also presented.
展开▼