Conference: Annual conference on Neural Information Processing Systems

Robust Bloom Filters for Large Multilabel Classification Tasks


Abstract

This paper presents an approach to multilabel classification (MLC) with a large number of labels. Our approach is a reduction to binary classification in which label sets are represented by low dimensional binary vectors. This representation follows the principle of Bloom filters, a space-efficient data structure originally designed for approximate membership testing. We show that a naive application of Bloom filters in MLC is not robust to individual binary classifiers' errors. We then present an approach that exploits a specific feature of real-world datasets when the number of labels is large: many labels (almost) never appear together. Our approach is provably robust, has sublinear training and inference complexity with respect to the number of labels, and compares favorably to state-of-the-art algorithms on two large scale multilabel datasets.
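To make the label-set representation concrete, the following is a minimal sketch of a standard Bloom filter encoding, not the robust variant the paper proposes. The function names (`bit_positions`, `encode`, `decode`), the salted SHA-256 hashing, and the parameter values are all assumptions chosen for illustration. In the reduction described in the abstract, each bit of the encoded vector would be predicted by its own binary classifier; here the bits are simply computed from the true label set.

```python
import hashlib


def bit_positions(label, num_bits, num_hashes):
    """Map a label to num_hashes bit positions (salted SHA-256, an illustrative choice)."""
    positions = []
    for seed in range(num_hashes):
        digest = hashlib.sha256(f"{seed}:{label}".encode()).digest()
        positions.append(int.from_bytes(digest[:8], "big") % num_bits)
    return positions


def encode(label_set, num_bits, num_hashes):
    """Represent a set of labels as a binary vector of length num_bits."""
    bits = [0] * num_bits
    for label in label_set:
        for pos in bit_positions(label, num_bits, num_hashes):
            bits[pos] = 1
    return bits


def decode(bits, num_labels, num_hashes):
    """Return every label whose bit positions are all set (may include false positives)."""
    num_bits = len(bits)
    return {
        label
        for label in range(num_labels)
        if all(bits[pos] for pos in bit_positions(label, num_bits, num_hashes))
    }


if __name__ == "__main__":
    labels = {3, 17, 42}
    bits = encode(labels, num_bits=64, num_hashes=3)
    decoded = decode(bits, num_labels=1000, num_hashes=3)
    print("true labels recovered:", labels <= decoded)

    # Simulate a single binary classifier error: one bit of label 17 is predicted as 0.
    corrupted = list(bits)
    corrupted[bit_positions(17, 64, 3)[0]] = 0
    print("label 17 still decoded:", 17 in decode(corrupted, 1000, 3))
```

With exact bits, decoding never misses a true label, although it can return extra labels. Flipping a single bit of a label to 0 removes that label from the decoded set, which illustrates the sensitivity to individual classifier errors that the abstract attributes to the naive approach.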
