ADAPTIVE SAMPLING SCHEME FOR IMBALANCED LARGE SCALE DATA

Abstract

Embodiments of the present invention relate to providing business customers with predictive capabilities, such as identifying valuable customers or estimating the likelihood that a product will be purchased. An adaptive sampling scheme is used to generate sample data points from large-scale data that is imbalanced (for example, digital website traffic with hundreds of millions of visitors, of whom only a small portion are of interest). In embodiments, a stream of sample data points is received. Positive samples are added to a positive list until the desired number of positive samples is reached, and negative samples are added to a negative list until the desired number of negative samples is reached. The positive list and the negative list can then be combined, shuffled, and fed into a prediction model.
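The sampling loop described in the abstract can be sketched in Python as follows. This is only an illustrative sketch, not the patented implementation: the function name `adaptive_sample`, the parameters `n_pos`, `n_neg`, and `seed`, and the `(features, label)` encoding with label 1 for positive samples are all assumptions introduced here. The abstract does not say what happens if the stream ends before both quotas are met; this sketch simply returns whatever was collected.

```python
import random
from typing import Any, Iterable, List, Optional, Tuple

# Assumed encoding (not from the source): each sample is (features, label),
# where label == 1 marks a positive sample and anything else a negative one.
Sample = Tuple[Any, int]


def adaptive_sample(
    stream: Iterable[Sample],
    n_pos: int,
    n_neg: int,
    seed: Optional[int] = None,
) -> List[Sample]:
    """Collect up to n_pos positives and n_neg negatives, then shuffle them."""
    positives: List[Sample] = []
    negatives: List[Sample] = []

    for features, label in stream:
        if label == 1:
            # Keep positives only until the positive quota is reached.
            if len(positives) < n_pos:
                positives.append((features, label))
        elif len(negatives) < n_neg:
            # Keep negatives only until the negative quota is reached.
            negatives.append((features, label))

        # Stop reading the stream once both quotas are met.
        if len(positives) >= n_pos and len(negatives) >= n_neg:
            break

    # Combine the two lists and shuffle before handing them to a model.
    combined = positives + negatives
    random.Random(seed).shuffle(combined)
    return combined
```

Because the loop stops as soon as both quotas are filled, it reads only as much of the stream as it needs, which matters when positives are rare and the stream is very large.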
