International Symposium on Methodologies for Intelligent Systems

Hyperbolic Embeddings for Hierarchical Multi-label Classification

Abstract

Hierarchical multi-label classification (HMC) is a practically relevant machine learning task with applications ranging from text categorization and image annotation to functional genomics. State-of-the-art results for HMC are obtained with ensembles of predictive models, especially ensembles of predictive clustering trees. Predictive clustering trees (PCTs) generalize decision trees towards HMC and can be combined into ensembles using techniques such as bagging and random forests. Two major issues influence the performance of HMC methods: (1) the computational bottleneck imposed by the size of the label hierarchy, which can easily reach tens of thousands of labels, and (2) the sparsity of annotations in the label/output space. To address these limitations, we propose an approach that combines graph node embeddings with a specific property of PCTs: the descriptive, clustering, and target attributes can be specified arbitrarily. We adapt Poincaré hyperbolic node embeddings to obtain low-dimensional label set embeddings, which are then used in place of the original label space to guide PCT construction. Because the embeddings have far fewer dimensions than the label space, this greatly reduces the time needed to construct a tree. The input and output spaces remain the same: the tests in the tree use the original attributes, and the leaves predict the original labels directly. We empirically evaluate the proposed approach on nine datasets. The results show that our approach dramatically reduces the computational cost of learning and can lead to improved predictive performance.
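For readers unfamiliar with Poincaré embeddings, the sketch below shows the standard distance function on the Poincaré ball, the geometry in which such node embeddings are placed. It is a minimal illustration only: the function name `poincare_distance` and the example points are assumptions made here, and the abstract does not specify how the authors train the embeddings or aggregate them into label set embeddings.

```python
import math


def poincare_distance(u, v):
    """Hyperbolic distance between two points strictly inside the unit ball.

    Uses the standard Poincare-ball formula:
        d(u, v) = arcosh(1 + 2 * |u - v|^2 / ((1 - |u|^2) * (1 - |v|^2)))
    """
    sq_u = sum(x * x for x in u)
    sq_v = sum(x * x for x in v)
    if sq_u >= 1.0 or sq_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.acosh(1.0 + 2.0 * sq_diff / ((1.0 - sq_u) * (1.0 - sq_v)))


# Hypothetical 2-D points: two near the origin, two near the boundary.
near_center = poincare_distance((0.1, 0.0), (0.0, 0.1))
near_boundary = poincare_distance((0.9, 0.0), (0.0, 0.9))
```

Distances grow rapidly towards the boundary of the ball, so `near_boundary` is much larger than `near_center` even though both pairs have the same angular separation. This property is what lets tree-like hierarchies fit in few dimensions, with general nodes near the origin and specific ones near the boundary.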
