Conference: ACM SIGKDD International Conference on Knowledge Discovery and Data Mining; KDD 10
Unsupervised Transfer Classification: Application to Text Categorization

Abstract

We study the problem of building a classification model for a target class in the absence of any labeled training examples for that class. To address this difficult learning problem, we extend the idea of transfer learning by assuming that the following side information is available: (i) a collection of labeled examples belonging to other classes in the problem domain, called the auxiliary classes; (ii) class information, including the prior of the target class and the correlation between the target class and the auxiliary classes. Our goal is to construct the classification model for the target class by leveraging this data and information. We refer to this learning problem as unsupervised transfer classification. Our framework is based on the generalized maximum entropy model, which is effective in transferring the label information of the auxiliary classes to the target class. A theoretical analysis shows that, under certain assumptions, the classification model obtained by the proposed approach converges to the optimal model that would be learned from labeled examples of the target class. An empirical study on text categorization over four different data sets verifies the effectiveness of the proposed approach.
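The setting in the abstract can be illustrated with a small sketch. This is not the paper's algorithm: it is a simplified stand-in that assumes the "correlation" side information is given as conditional probabilities P(target | auxiliary class k), turns each example's auxiliary labels into a soft target, and fits a binary maximum entropy (logistic) model to those soft targets. All names (`soft_targets`, `fit_maxent`, `predict`, `cond_prob`) and the toy data are hypothetical.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_targets(aux_labels, cond_prob, prior):
    """Estimate q_i, a guess at P(target | auxiliary labels of example i).

    Assumption: average P(target | class k) over the auxiliary classes
    the example belongs to; fall back to the target prior if it has none.
    """
    targets = []
    for labels in aux_labels:
        active = [cond_prob[k] for k, on in enumerate(labels) if on]
        targets.append(sum(active) / len(active) if active else prior)
    return targets


def fit_maxent(features, targets, l2=0.01, lr=0.5, steps=2000):
    """Fit a logistic model to soft targets by gradient descent.

    At the optimum the model's expected feature counts match the
    soft-target-weighted counts (up to the L2 penalty), which is the
    moment-matching view of a maximum entropy model.
    """
    dim = len(features[0])
    n = len(features)
    w = [0.0] * dim
    b = 0.0
    for _ in range(steps):
        grad_w = [l2 * wj for wj in w]
        grad_b = 0.0
        for x, q in zip(features, targets):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)
            err = (p - q) / n
            for j in range(dim):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w, b, x):
    """Estimated probability that x belongs to the target class."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


# Toy data (hypothetical): two word-count features, two auxiliary classes.
features = [[1, 0], [1, 0], [0, 1], [0, 1], [1, 1]]
aux_labels = [[1, 0], [1, 0], [0, 1], [0, 1], [0, 0]]
cond_prob = [0.9, 0.1]   # assumed P(target | auxiliary class k)
prior = 0.3              # assumed prior of the target class

q = soft_targets(aux_labels, cond_prob, prior)
w, b = fit_maxent(features, q)
```

In this toy setup no example carries a target-class label; the model is driven only by the auxiliary labels, the assumed conditional probabilities, and the prior, which mirrors the kind of side information the abstract lists.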
