Ballpark Learning: Estimating Labels from Rough Group Comparisons

Abstract

We are interested in estimating individual labels given only coarse, aggregated signals over the data points. In our setting, we receive sets ("bags") of unlabeled instances with constraints on label proportions. We relax the unrealistic assumption of known label proportions made in previous work; instead, we assume only upper and lower bounds on the proportions, along with constraints on differences between bags. We motivate the problem, propose an intuitive formulation and algorithm, and apply our methods to real-world scenarios. Across several domains, we show that we can achieve surprisingly high accuracy using only proportion constraints and no labeled examples. In particular, we demonstrate how to predict income level using rough stereotypes and how to perform sentiment analysis using very little information. We also apply our method to guide exploratory analysis, recovering geographical differences in Twitter dialect.
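
To make the setting concrete, here is a minimal sketch of learning from bag-level proportion bounds and bag-difference constraints. It is not the formulation or algorithm of the paper, which the abstract does not specify. It assumes a logistic model whose mean predicted probability in each bag stands in for that bag's label proportion, and it penalizes constraint violations with squared hinge terms minimized by plain gradient descent. All names (`fit_ballpark_sketch`, `bag_proportion`, `bounds`, `diffs`) and the default hyperparameters are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def bag_proportion(X, w, b):
    """Mean predicted probability of the positive label within one bag."""
    return float(sigmoid(X @ w + b).mean())


def fit_ballpark_sketch(bags, bounds, diffs=(), lr=0.2, steps=3000, l2=1e-3):
    """Fit a logistic model from bag-level constraints only (illustrative).

    bags:   list of (n_k, dim) arrays of unlabeled instances
    bounds: list of (lower, upper) bounds on each bag's positive proportion
    diffs:  iterable of (i, j, margin), meaning prop_i - prop_j >= margin
    """
    dim = bags[0].shape[1]
    w, b = np.zeros(dim), 0.0
    for _ in range(steps):
        props, grads = [], []
        for X in bags:
            p = sigmoid(X @ w + b)
            s = p * (1.0 - p)  # derivative of the sigmoid at each instance
            props.append(p.mean())
            # Gradient of this bag's mean probability w.r.t. (w, b).
            grads.append(((s[:, None] * X).mean(axis=0), s.mean()))

        # Start from the gradient of the L2 regularizer (0.5 * l2 * |w|^2).
        grad_w, grad_b = l2 * w, 0.0

        # Squared hinge penalty for proportions outside [lower, upper].
        for k, (lo, hi) in enumerate(bounds):
            c = 2.0 * (max(props[k] - hi, 0.0) - max(lo - props[k], 0.0))
            grad_w = grad_w + c * grads[k][0]
            grad_b += c * grads[k][1]

        # Squared hinge penalty when prop_i - prop_j falls short of margin.
        for i, j, margin in diffs:
            gap = max(margin - (props[i] - props[j]), 0.0)
            grad_w = grad_w + 2.0 * gap * (grads[j][0] - grads[i][0])
            grad_b += 2.0 * gap * (grads[j][1] - grads[i][1])

        w = w - lr * grad_w
        b -= lr * grad_b
    return w, b
```

Calling `fit_ballpark_sketch([X_a, X_b], [(0.6, 0.9), (0.1, 0.4)], diffs=[(0, 1, 0.3)])` encodes the rough belief that bag `a` is mostly positive, bag `b` is mostly negative, and `a` exceeds `b` by at least 30 percentage points. No individual labels are needed, and thresholding the fitted model's probabilities gives per-instance label estimates.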
