Learning with Annotation Noise

Abstract

It is usually assumed that the kind of noise existing in annotated data is random classification noise. Yet there is evidence that differences between annotators are not always random attention slips but could result from different biases towards the classification categories, at least for the harder-to-decide cases. Under an annotation generation model that takes this into account, there is a hazard that some of the training instances are actually hard cases with unreliable annotations. We show that these are relatively unproblematic for an algorithm operating under the 0-1 loss model, whereas for the commonly used voted perceptron algorithm, hard training cases could result in incorrect prediction on the uncontroversial cases at test time.
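The annotation-generation model described in the abstract can be illustrated with a toy 1-D sketch: easy instances (far from the true boundary) receive reliable labels, while hard instances receive labels drawn with an annotator bias toward one category, and a perceptron is then trained on the resulting noisy data. The `annotate` function, the `margin` and `bias` values, and the plain mistake-driven perceptron below are illustrative assumptions, not the paper's exact construction or analysis:

```python
import random

random.seed(0)

def annotate(x, margin=0.5, bias=0.8):
    """Return a (possibly noisy) annotation for instance x.

    Easy instances (at least `margin` from the true boundary at 0) get
    the correct label; hard instances get a label drawn with probability
    `bias` toward class +1, modeling an annotator's category preference
    rather than a random attention slip."""
    true_label = 1 if x > 0 else -1
    if abs(x) >= margin:
        return true_label                       # easy case: reliable label
    return 1 if random.random() < bias else -1  # hard case: biased label

# Training set: a mix of easy and hard instances with noisy labels.
train = [(x, annotate(x)) for x in (random.uniform(-2, 2) for _ in range(200))]

# Plain perceptron (with a bias term) trained on the noisy labels.
w = b = 0.0
for _ in range(20):                  # fixed number of epochs
    for x, y in train:
        if y * (w * x + b) <= 0:     # mistake-driven update
            w += y * x
            b += y

# Evaluate only on easy ("uncontroversial") test instances.
test = [x for x in (random.uniform(-2, 2) for _ in range(1000)) if abs(x) >= 0.5]
acc = sum((1 if w * x + b > 0 else -1) == (1 if x > 0 else -1)
          for x in test) / len(test)
print(f"accuracy on easy cases: {acc:.2f}")
```

In this 1-D toy the biased labels on hard cases pull the learned threshold toward the favored class; the paper's point is that for a voted-perceptron-style learner such hard training cases can translate into errors on the uncontroversial cases at test time.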
