Published in: Advances in Information Retrieval (conference proceedings)

Domain Adaptation for Text Categorization by Feature Labeling

Abstract

We present a novel approach to domain adaptation for text categorization, which requires only that the source-domain data be weakly annotated in the form of labeled features. The main advantage of our approach is that labeling words is less expensive than labeling documents. We propose two methods. The first seeks to minimize the divergence between the distributions of the source domain, which contains labeled features, and the target domain, which contains only unlabeled data. The second augments the labeled feature set in an unsupervised way, by discovering a latent concept space shared between source and target. We show empirically that our approach outperforms standard supervised and semi-supervised methods and obtains results competitive with those reported for state-of-the-art domain adaptation methods, while requiring considerably less supervision.
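To make the notion of "labeled features" concrete, the following minimal Python sketch shows how a handful of class-tagged words can stand in for labeled documents when scoring unlabeled text. It is an illustration only and not the method proposed in the paper: the word list, the class names, the add-one smoothing, and the helper names `class_distribution` and `predict` are all assumptions made for this example.

```python
from collections import Counter

# Hypothetical labeled features: an annotator tags individual words
# with a class instead of labeling whole documents.
LABELED_FEATURES = {
    "goal": "sports",
    "match": "sports",
    "election": "politics",
    "senate": "politics",
}


def class_distribution(document, labeled_features=LABELED_FEATURES, smoothing=1.0):
    """Return a smoothed class distribution for one unlabeled document.

    Each occurrence of a labeled word counts as one vote for its class;
    additive smoothing keeps every class probability above zero.
    """
    classes = sorted(set(labeled_features.values()))
    votes = Counter()
    for token in document.lower().split():
        label = labeled_features.get(token)
        if label is not None:
            votes[label] += 1
    total = sum(votes.values()) + smoothing * len(classes)
    return {c: (votes[c] + smoothing) / total for c in classes}


def predict(document):
    """Pick the class with the highest smoothed vote share."""
    dist = class_distribution(document)
    return max(dist, key=dist.get)
```

A target-domain document such as "The senate election was close" would be assigned to `politics` here because two of its words carry that label. The approaches described in the abstract go further than this vote count, by matching source and target distributions and by expanding the labeled feature set through a shared latent concept space.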
