IEEE International Conference on Data Mining Workshops

Multi-Classes Feature Engineering with Sliding Window for Purchase Prediction in Mobile Commerce

Abstract

Mobile devices have become increasingly prevalent in recent years, especially among young people. Their rapid progress has promoted the development of M-Commerce. Purchases on mobile terminals account for a considerable share of the total trading volume of E-Commerce and have begun to draw the attention of E-Commerce corporations. Alibaba held a Mobile Recommendation Algorithm Competition that aimed to recommend appropriate items to mobile users at the right time and place. The dataset provided by Alibaba consists of about 6 billion operation logs made by 5 million Taobao users on over 150 million items over a period of one month. Compared with traditional purchase-prediction scenarios, the competition raised three challenges: (1) the dataset is too large to be processed on personal computers; (2) some days with large discounts offered by Taobao Marketplace fall within the period covered by the dataset; and (3) positive samples are too few relative to the dimension of the features. In this paper we study the problem of predicting the purchase behaviour of M-Commerce users by exploring a solution for Alibaba's Mobile Recommendation Algorithm Competition. We first study customers' habits in depth and filter out many outliers. We then adopt a "sliding window" method to supplement the positive samples of the training dataset and to smooth the burst of sales near Dec 12th. We design a feature engineering framework that extracts 6 categories of features aimed at capturing the buying potential of user-item pairs. Our features exploit the interaction of the user-item pair, the user's shopping habits, and the item's attraction for users. We then apply Gradient Boosting Decision Trees (GBDT) as the training model. Finally, we combine the outputs of the individual GBDT models with Logistic Regression to obtain the final predictions. Our solution achieves an F1 score of 8.66% and ranks third in the final round.
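
The abstract does not spell out how the sliding window is built, so the following is only a minimal sketch of one common reading: each day in a range serves in turn as the label day, the user-item pairs seen in the preceding days become candidates, and a pair is positive if it was bought on the label day. The log schema `(user_id, item_id, action, day)`, the function name `sliding_window_samples`, and the `window_days` parameter are assumptions made for illustration and are not taken from the paper.

```python
from datetime import timedelta


def sliding_window_samples(logs, first_label_day, last_label_day, window_days=7):
    """Build (user, item, label_day, label) samples, one batch per label day.

    logs: iterable of (user_id, item_id, action, day) tuples, where action is
    "view" or "buy" and day is a datetime.date (hypothetical schema).
    """
    logs = list(logs)
    samples = []
    label_day = first_label_day
    while label_day <= last_label_day:
        window_start = label_day - timedelta(days=window_days)
        # Candidates: pairs with any interaction inside the observation window.
        candidates = {(u, i) for u, i, _, d in logs
                      if window_start <= d < label_day}
        # Positives: pairs actually bought on the label day.
        bought = {(u, i) for u, i, a, d in logs
                  if a == "buy" and d == label_day}
        for user, item in sorted(candidates):
            samples.append((user, item, label_day, int((user, item) in bought)))
        # Slide the window forward by one day and repeat.
        label_day += timedelta(days=1)
    return samples
```

Under these assumptions, sliding over several label days yields one batch of positives per day rather than a single batch, which is one way the scarcity of positive samples could be eased. Averaging over many label days would also dilute the influence of any single promotional day.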