Published in: Advances in Data Mining: Applications and Theoretical Aspects (conference proceedings)

Fast Data Acquisition in Cost-Sensitive Learning

Abstract

Data acquisition is the first and one of the most important steps in many data mining applications. It is a time-consuming and costly task. Acquiring too few examples makes the learned model and its future predictions inaccurate, while acquiring more examples than necessary wastes time and money. It is therefore very important to estimate the number of examples a learning algorithm needs. However, most previous learning algorithms learn from a given, fixed set of examples. To our knowledge, little previous work in machine learning can dynamically acquire examples as it learns and decide the ideal number of examples needed. In this paper, we propose a simple on-line framework for fast data acquisition (FDA). FDA is an extrapolation method that estimates the number of examples needed in each acquisition and acquires them all at once. Compared with the naive step-by-step data acquisition strategy, FDA significantly reduces the number of data acquisition and model building rounds. This in turn significantly reduces the total cost, which comprises the costs of misclassification, data acquisition arrangement, computation, and the acquired examples.
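
The abstract does not spell out how FDA extrapolates, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes the error rate follows a power-law learning curve, error ≈ a · n^(-b), fits that curve to the errors observed so far, and extrapolates to the training-set size at which a target error would be reached. The function names, the power-law form, and the use of a target error (rather than a total-cost criterion) are all illustrative assumptions.

```python
import math


def fit_power_law(sizes, errors):
    """Fit error ~= a * n**(-b) by least squares in log-log space.

    ASSUMPTION: a power-law learning curve; the source does not state
    which curve family FDA extrapolates.
    """
    xs = [math.log(n) for n in sizes]
    ys = [math.log(e) for e in errors]
    k = len(xs)
    mean_x = sum(xs) / k
    mean_y = sum(ys) / k
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope  # (a, b)


def examples_needed(sizes, errors, target_error):
    """Extrapolate the training-set size expected to reach target_error.

    Hypothetical helper: returns at least the number of examples
    already acquired.
    """
    a, b = fit_power_law(sizes, errors)
    if b <= 0:
        raise ValueError("error is not decreasing; cannot extrapolate")
    estimate = (a / target_error) ** (1.0 / b)
    return max(sizes[-1], math.ceil(estimate))


# Made-up observations: error halves each time the data doubles.
observed_sizes = [100, 200, 400]
observed_errors = [0.20, 0.10, 0.05]
estimated_total = examples_needed(observed_sizes, observed_errors, 0.01)
print(estimated_total - observed_sizes[-1], "more examples to acquire")
```

With an estimate like this, the remaining examples can be requested in one batch instead of one small step at a time, which is the source of the savings in acquisition and model building rounds that the abstract describes. In practice the estimate would be refitted after each batch, since an extrapolation from a few points can be far off.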
