Published in: IEEE Transactions on Pattern Analysis and Machine Intelligence

Discovering Low-Rank Shared Concept Space for Adapting Text Mining Models


Abstract

We propose a framework for adapting text mining models that discovers a low-rank shared concept space. The major characteristic of this concept space is that it explicitly minimizes the distribution gap between the source domain, which has sufficient labeled data, and the target domain, which has only unlabeled data, while at the same time minimizing the empirical loss on the labeled data in the source domain. Our method can perform the domain adaptation task both in the original feature space and in the transformed Reproducing Kernel Hilbert Space (RKHS) using kernel tricks. Theoretical analysis guarantees that the error of our adaptation model is bounded in terms of the embedded distribution gap and the empirical loss in the source domain. We have conducted extensive experiments on two common text mining problems, namely document classification and information extraction, to demonstrate the efficacy of the proposed framework.
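
The abstract describes an objective with two parts: a distribution gap between source and target data in a shared low-rank space, and an empirical loss on labeled source data. The sketch below illustrates one way such an objective might be evaluated; it is not the paper's formulation. It assumes a linear projection `W`, measures the gap as the squared distance between the projected source and target means, and uses a squared loss with a linear predictor `w`. The function names and the trade-off weight `lam` are hypothetical, since the abstract specifies none of these details.

```python
import numpy as np


def projected_mean_gap(Xs, Xt, W):
    """Squared distance between mean source and mean target embeddings.

    Xs: (n_s, d) source features; Xt: (n_t, d) target features.
    W:  (d, k) projection with k << d (assumed stand-in for the concept space).
    """
    diff = Xs.mean(axis=0) - Xt.mean(axis=0)  # gap between domain means
    z = diff @ W                              # gap after low-rank projection
    return float(z @ z)


def source_squared_loss(Xs, ys, W, w):
    """Mean squared error of a linear predictor on projected source data."""
    pred = (Xs @ W) @ w
    return float(np.mean((pred - ys) ** 2))


def joint_objective(Xs, ys, Xt, W, w, lam=1.0):
    """Source loss plus lam times the distribution gap (lam is an assumption)."""
    return source_squared_loss(Xs, ys, W, w) + lam * projected_mean_gap(Xs, Xt, W)
```

Under these assumptions, a projection that discards the directions along which the two domains differ drives the gap term to zero, while the loss term keeps the projection useful for prediction on the source labels.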
