Journal: SIGKDD Explorations

AnnexML: Approximate Nearest Neighbor Search for Extreme Multi-label Classification


Abstract

Extreme multi-label classification methods have been widely used in Web-scale classification tasks such as Web page tagging and product recommendation. In this paper, we present a novel graph embedding method called "AnnexML". In the training step, AnnexML constructs a k-nearest neighbor graph of label vectors and attempts to reproduce the graph structure in the embedding space. Prediction is performed with an approximate nearest neighbor search method that efficiently explores the learned k-nearest neighbor graph in the embedding space. We evaluated the method on several large-scale real-world data sets and compared it with recent state-of-the-art methods. Experimental results show that AnnexML can significantly improve prediction accuracy, especially on data sets that have a larger label space. In addition, AnnexML improves the trade-off between prediction time and accuracy. At the same level of accuracy, prediction with AnnexML was up to 58 times faster than with SLEEC, a state-of-the-art embedding-based method.
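As a rough illustration of the two ideas named in the abstract (a k-nearest neighbor graph over label vectors, and prediction from nearest neighbors in an embedding space), the following minimal sketch may help. It is not the authors' implementation: it uses exact brute-force cosine search in place of approximate nearest neighbor search, it does not learn an embedding, and all function names, the toy data, and the similarity-weighted voting rule are assumptions made for illustration only.

```python
import numpy as np

def normalize_rows(M):
    """Scale each row to unit L2 norm (all-zero rows are left unchanged)."""
    norms = np.linalg.norm(M, axis=1, keepdims=True)
    norms[norms == 0.0] = 1.0
    return M / norms

def knn_graph(Y, k):
    """For each row of the label matrix Y, return the indices of the k
    other rows with the highest cosine similarity (exact search)."""
    Yn = normalize_rows(Y.astype(float))
    sim = Yn @ Yn.T
    np.fill_diagonal(sim, -np.inf)  # exclude self-matches
    return np.argsort(-sim, axis=1, kind="stable")[:, :k]

def predict_labels(z_query, Z_train, Y_train, k, top):
    """Rank labels for one query embedding by similarity-weighted votes
    from its k nearest training embeddings (hypothetical scoring rule)."""
    Zn = normalize_rows(Z_train.astype(float))
    q = z_query / max(np.linalg.norm(z_query), 1e-12)
    sim = Zn @ q
    nbrs = np.argsort(-sim, kind="stable")[:k]
    scores = sim[nbrs] @ Y_train[nbrs]
    return np.argsort(-scores, kind="stable")[:top]

# Toy data (made up): 4 training points, 4 labels, 2-d embeddings.
Y = np.array([[1, 1, 0, 0],
              [1, 1, 0, 0],
              [0, 0, 1, 1],
              [0, 0, 1, 0]])
Z = np.array([[1.0, 0.0],
              [0.9, 0.1],
              [0.0, 1.0],
              [0.1, 0.9]])

graph = knn_graph(Y, k=1)
predicted = predict_labels(np.array([1.0, 0.05]), Z, Y, k=2, top=2)
```

In this toy setup, points that share labels become neighbors in the label-vector graph, and a query embedded near the first two training points inherits their labels. A real system would replace the brute-force similarity computation with an approximate search structure to keep prediction time low.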
