Pacific-Asia Conference on Knowledge Discovery and Data Mining

Uncovering the Latent Structures of Crowd Labeling


Abstract

Crowdsourcing provides a new way to distribute large-scale tasks to a crowd of annotators. The diverse knowledge backgrounds and personal preferences of crowd annotators lead to noisy (or even inconsistent) answers to the same question. However, diverse labels also carry information about the underlying structures of tasks and annotators. This paper proposes latent-class assumptions for learning-from-crowds models: items can be separated into several latent classes, and workers' annotating behaviors may differ across classes. We propose a nonparametric model to uncover the latent classes, and also extend the state-of-the-art minimax entropy estimator to learn latent structures. Experimental results on both synthetic data and real data collected from Amazon Mechanical Turk demonstrate that our methods can disclose interesting and meaningful latent structures, and that incorporating latent class structures can also bring significant improvements in ground-truth label recovery for difficult tasks.
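
To make the latent-class assumption concrete, the following is a minimal sketch of how labels might be simulated when each worker's reliability depends on a hidden item class. It is not the paper's nonparametric model or its minimax entropy estimator. The function names, the uniform accuracy range, and the majority-vote baseline are all assumptions introduced here for illustration.

```python
import random
from collections import Counter


def simulate_crowd_labels(n_items, n_workers, n_classes, n_labels, seed=0):
    """Sample crowd labels under an illustrative latent-class assumption.

    Requires n_labels >= 2 so that a wrong label always exists.
    """
    rng = random.Random(seed)
    # Hypothetical reliability: one accuracy per (worker, latent class) pair.
    accuracy = [[rng.uniform(0.4, 0.95) for _ in range(n_classes)]
                for _ in range(n_workers)]
    item_class = [rng.randrange(n_classes) for _ in range(n_items)]
    truth = [rng.randrange(n_labels) for _ in range(n_items)]
    labels = []
    for i in range(n_items):
        row = []
        for w in range(n_workers):
            if rng.random() < accuracy[w][item_class[i]]:
                row.append(truth[i])  # worker answers correctly
            else:
                wrong = [l for l in range(n_labels) if l != truth[i]]
                row.append(rng.choice(wrong))  # uniform over wrong labels
        labels.append(row)
    return labels, truth, item_class


def majority_vote(labels):
    """Baseline aggregation: the most frequent label for each item."""
    return [Counter(row).most_common(1)[0][0] for row in labels]


def accuracy_by_class(pred, truth, item_class, n_classes):
    """Fraction of correctly recovered labels within each latent class."""
    result = {}
    for c in range(n_classes):
        idx = [i for i, k in enumerate(item_class) if k == c]
        if idx:  # skip classes with no items
            result[c] = sum(pred[i] == truth[i] for i in idx) / len(idx)
    return result


if __name__ == "__main__":
    labels, truth, item_class = simulate_crowd_labels(200, 9, 3, 2, seed=1)
    pred = majority_vote(labels)
    print(accuracy_by_class(pred, truth, item_class, 3))
```

Comparing the per-class recovery rates gives a rough sense of why a class-blind aggregator may do worse on some groups of items than on others, which is the kind of structure the abstract says a latent-class model could exploit.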