Conference: 22nd International Conference on Computational Linguistics

Active Learning with Sampling by Uncertainty and Density for Word Sense Disambiguation and Text Classification


Abstract

This paper addresses two issues in active learning. First, uncertainty sampling often fails because it selects outliers; to address this, the paper presents a new selective sampling technique, sampling by uncertainty and density (SUD), in which a k-nearest-neighbor-based density measure is used to determine whether an unlabeled example is an outlier. Second, a sampling by clustering (SBC) technique is applied to build a representative initial training data set for active learning. Finally, we implement a new active learning algorithm that combines the SUD and SBC techniques. Experimental results on three real-world data sets show that our method outperforms competing methods, particularly in the early stages of active learning.
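The SUD idea in the abstract can be illustrated with a minimal sketch. The abstract states only that a k-nearest-neighbor-based density measure is used alongside uncertainty; the specific choices below are assumptions made for illustration: entropy as the uncertainty measure, cosine similarity as the similarity function, and a simple product of the two scores as the combination rule. All function names and the toy data are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def knn_density(i, pool, k):
    """Mean similarity between pool[i] and its k most similar pool members."""
    sims = sorted(
        (cosine(pool[i], x) for j, x in enumerate(pool) if j != i),
        reverse=True,
    )
    top = sims[:k]
    return sum(top) / len(top) if top else 0.0


def sud_scores(pool, probs, k=2):
    """Uncertainty times density per unlabeled example (assumed combination)."""
    return [entropy(p) * knn_density(i, pool, k) for i, p in enumerate(probs)]


def select_query(pool, probs, k=2):
    """Index of the unlabeled example with the highest combined score."""
    scores = sud_scores(pool, probs, k)
    return max(range(len(scores)), key=scores.__getitem__)


# Toy pool: three similar feature vectors and one outlier (index 3).
pool = [[1.0, 0.1], [1.0, 0.2], [0.9, 0.1], [0.0, 1.0]]
# Hypothetical classifier posteriors; the outlier is the most uncertain.
probs = [[0.6, 0.4], [0.7, 0.3], [0.8, 0.2], [0.5, 0.5]]

chosen = select_query(pool, probs)
most_uncertain = max(range(len(probs)), key=lambda i: entropy(probs[i]))
print(chosen, most_uncertain)
```

In this toy pool, plain uncertainty sampling would pick the outlier, because its posterior is the flattest. Weighting by density shifts the choice to an uncertain example that lies in a dense region, which is the failure mode the abstract says SUD is meant to avoid.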

Bibliographic details

  • Source
  • Conference location: Manchester (GB)
  • Author affiliations

    Natural Language Processing Laboratory, Northeastern University, Shenyang, Liaoning, P.R. China 110004;

    Natural Language Processing Laboratory, Northeastern University, Shenyang, Liaoning, P.R. China 110004;

    Natural Language Processing Laboratory, Northeastern University, Shenyang, Liaoning, P.R. China 110004;

    Language Information Sciences Research Centre, City University of Hong Kong, HK, P.R. China;

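  • Conference organizer
  • Original format: PDF
  • Text language: English
  • Chinese Library Classification: Program design, software engineering
  • Keywords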
