To avoid the undesired effects of distance concentration in high-dimensional spaces, previous work has advocated the use of fractional ℓp norms instead of the ubiquitous Euclidean norm. Closely related to concentration is the emergence of hub and anti-hub objects: hub objects have a small distance to an exceptionally large number of data points, while anti-hubs lie far from all other data points. The contribution of this work is an empirical examination of concentration and hubness, resulting in an unsupervised approach for choosing an ℓp norm by minimizing hubs while simultaneously maximizing nearest neighbor classification accuracy.
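As a rough illustration of the idea, the following sketch scores several candidate values of p by how hub-prone the resulting nearest-neighbor lists are, and picks the least hub-prone one. It rests on assumptions not stated in the abstract: hubness is measured here as the skewness of the k-occurrence distribution (how often each point appears among the k nearest neighbors of other points), a commonly used measure, and the candidate grid, the value of k, and all function names (`minkowski_distances`, `k_occurrence`, `hubness_skewness`, `choose_p`) are hypothetical. It is a simplified stand-in, not the authors' procedure.

```python
import numpy as np


def minkowski_distances(X, p):
    """Pairwise l_p dissimilarities between rows of X.

    For 0 < p < 1 (the fractional case) this is a quasi-norm, not a true norm.
    Builds an (n, n, d) array, so it only suits small data sets.
    """
    diff = np.abs(X[:, None, :] - X[None, :, :])
    return (diff ** p).sum(axis=-1) ** (1.0 / p)


def k_occurrence(D, k):
    """Count how often each point appears in other points' k-NN lists."""
    D = D.copy()
    np.fill_diagonal(D, np.inf)  # a point is not its own neighbor
    nearest = np.argsort(D, axis=1)[:, :k]
    return np.bincount(nearest.ravel(), minlength=D.shape[0])


def hubness_skewness(X, p, k=5):
    """Skewness of the k-occurrence distribution (assumed hubness measure)."""
    counts = k_occurrence(minkowski_distances(X, p), k).astype(float)
    std = counts.std()
    if std == 0:
        return 0.0  # every point is equally popular: no hubs at all
    return ((counts - counts.mean()) ** 3).mean() / std ** 3


def choose_p(X, candidates=(0.25, 0.5, 1.0, 2.0, 4.0), k=5):
    """Return the candidate p with the lowest hubness, plus all scores."""
    scores = {p: hubness_skewness(X, p, k) for p in candidates}
    return min(scores, key=scores.get), scores
```

A call such as `best_p, scores = choose_p(X)` on an `(n, d)` NumPy array needs no class labels, which is what makes the selection unsupervised. Whether the selected p also improves nearest neighbor classification would have to be checked separately on labeled data.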