
On active learning for data acquisition


Abstract

Many applications are characterized by having naturally incomplete data on customers - where data on only some fixed set of local variables is gathered. However, having a more complete picture can help build better models. The naive solution to this problem - acquiring complete data for all customers - is often impractical due to the costs of doing so. A possible alternative is to acquire complete data for "some" customers and to use this to improve the models built. The data acquisition problem is determining how many, and which, customers to acquire additional data from. In this paper we suggest using active learning based approaches for the data acquisition problem. In particular, we present initial methods for data acquisition and evaluate these methods experimentally on web usage data and UCI datasets. Results show that the methods perform well and indicate that active learning based methods for data acquisition can be a promising area for data mining research.
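To make the "which customers" part of the data acquisition problem concrete, here is a minimal sketch of one common active learning heuristic, uncertainty sampling. The abstract does not say which methods the paper uses, so this is an illustrative assumption rather than the authors' approach. The function name `select_customers`, the customer ids, and the probability values are all hypothetical, and the probabilities are assumed to come from some model trained on the incomplete, local data only.

```python
from typing import Dict, List


def select_customers(probabilities: Dict[str, float], budget: int) -> List[str]:
    """Pick up to `budget` customers whose predictions are least certain.

    `probabilities` maps a (hypothetical) customer id to the positive-class
    probability estimated by a model built on the incomplete, local data.
    """
    if budget < 0:
        raise ValueError("budget must be non-negative")
    # Uncertainty is measured as closeness to 0.5. Ties are broken by
    # customer id so the result is deterministic.
    ranked = sorted(
        probabilities,
        key=lambda cid: (abs(probabilities[cid] - 0.5), cid),
    )
    return ranked[:budget]


# Hypothetical scores for five customers; complete data is bought for two.
scores = {"c1": 0.95, "c2": 0.52, "c3": 0.10, "c4": 0.45, "c5": 0.70}
chosen = select_customers(scores, budget=2)
print(chosen)
```

In this sketch the budget fixes "how many" customers in advance. In practice one would likely acquire data in rounds, retraining the model on the newly completed records and stopping once the gain no longer justifies the acquisition cost.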
