Despite many efforts over the past decade, the ability to generate router-level topological maps of the Internet accurately and in a timely fashion remains elusive. Mapping campaigns commonly rely on traceroute-like probing that is usually non-adaptive and incomplete, and thus reveals only a portion of the underlying topology. In this paper we demonstrate that standard probing methods yield datasets that implicitly contain information about much more than just the directly observed links and routers. Each probe, together with the underlying domain knowledge, returns information that places constraints on the underlying topology, and by integrating a large number of such constraints it is possible to accurately infer the existence of unseen components of the Internet. We describe DomainImpute, a novel data analysis methodology designed to accurately infer the unseen hop-count distances between observed routers. We use both synthetic data and a large empirical dataset to validate the proposed methods. On our empirical dataset, we show that our methods can estimate over 55% of the unseen distances between observed routers to within a one-hop error.