Heavy-Tailed Symmetric Stochastic Neighbor Embedding

Abstract

Stochastic Neighbor Embedding (SNE) has been shown to be quite promising for data visualization. Currently, the most popular implementation, t-SNE, is restricted to a particular Student t-distribution as its embedding distribution. Moreover, it uses a gradient descent algorithm that may require users to tune parameters such as the learning step size and momentum to find its optimum. In this paper, we propose the Heavy-tailed Symmetric Stochastic Neighbor Embedding (HSSNE) method, which generalizes t-SNE to accommodate various heavy-tailed embedding similarity functions. This generalization presents two difficulties. The first is how to select the best embedding similarity among all heavy-tailed functions, and the second is how to optimize the objective function once the heavy-tailed function has been selected. Our contributions are: (1) we point out that various heavy-tailed embedding similarities can be characterized by their negative score functions, and based on this finding we present a parameterized subset of similarity functions for choosing the best tail-heaviness for HSSNE; (2) we present a fixed-point optimization algorithm that can be applied to all heavy-tailed functions and does not require the user to set any parameters; and (3) we present two empirical studies, one for unsupervised visualization showing that our optimization algorithm runs as fast as, and performs as well as, the best known t-SNE implementation, and the other for semi-supervised visualization showing quantitative superiority under the homogeneity measure as well as a qualitative advantage in cluster separation over t-SNE.
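
To make the idea of a parameterized family of heavy-tailed embedding similarities concrete, the following is a minimal sketch only. The abstract does not state the functional form, so the power-family kernel (alpha * d^2 + 1)^(-1/alpha) used here is an assumption for illustration, and the names heavy_tailed_kernel, embedding_similarities, and alpha are hypothetical. Under this assumption, alpha = 1 gives the Student t (Cauchy) kernel used by t-SNE, and alpha approaching 0 approaches a Gaussian kernel. The sketch covers only the similarity computation, not the fixed-point optimization algorithm.

```python
import numpy as np

def heavy_tailed_kernel(sq_dist, alpha):
    """Assumed power-family kernel: (alpha * d^2 + 1)^(-1/alpha).

    alpha = 1 is the Student t (Cauchy) kernel 1 / (1 + d^2);
    alpha = 0 is treated as the Gaussian limit exp(-d^2).
    Larger alpha means a heavier tail.
    """
    if alpha == 0:
        return np.exp(-sq_dist)
    return (alpha * sq_dist + 1.0) ** (-1.0 / alpha)

def embedding_similarities(Y, alpha):
    """Symmetric similarities Q over all pairs i != j, normalized to sum to 1."""
    diff = Y[:, None, :] - Y[None, :, :]   # pairwise differences, shape (n, n, d)
    sq_dist = np.sum(diff ** 2, axis=-1)   # squared Euclidean distances
    H = heavy_tailed_kernel(sq_dist, alpha)
    np.fill_diagonal(H, 0.0)               # exclude self-similarity
    return H / H.sum()

# Toy 2-D embedding of five points (random data, for illustration only).
rng = np.random.default_rng(0)
Y = rng.normal(size=(5, 2))
Q_student = embedding_similarities(Y, alpha=1.0)
Q_gauss = embedding_similarities(Y, alpha=0.0)
```

Varying alpha in this sketch changes only the tail-heaviness of the kernel; the normalization and symmetry of Q stay the same, which is what allows a single objective to be compared across tail settings.
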
