Modern Models for Learning Large-Scale Highly Skewed Online Advertising Data.

Abstract

Click-through rate (CTR) and conversion rate estimation are two core prediction tasks in online advertising. However, data scientists face four major challenges when analyzing advertising data: sheer volume, as the amount of data available for mining is massive; complex structure, as there is no easy way to tell which factors drive a user to click an ad or make a conversion, or how those factors interact with one another; high cardinality of categorical variables, as features such as device ID usually take a very large number of possible values, which leads to very sparse data; and severe skewness in the response variable, as the majority of users do not click the ad. In this paper, I give a comprehensive summary of the state-of-the-art machine learning models (decision-tree-based models, regularized logistic regression, online learning, and factorization machines) that are often used in industry to address these problems. I then provide insights and practical tricks based on a wide range of experiments conducted on multiple data sets with different characteristics.
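
To make two of the listed ideas concrete, the following is a minimal sketch of regularized logistic regression trained online over sparse one-hot features. It is an illustration only, not the thesis's implementation: the names `one_hot` and `OnlineLogisticRegression`, the field names, and the values of `learning_rate` and `l2` are all assumptions chosen for the example.

```python
import math


def one_hot(row):
    """Encode a record as the list of its active 'field=value' indicators.

    With a high-cardinality field such as device ID, nearly every possible
    indicator is zero, so only the active ones are stored (sparse encoding).
    """
    return [f"{field}={value}" for field, value in row.items()]


class OnlineLogisticRegression:
    """L2-regularized logistic regression updated one example at a time."""

    def __init__(self, learning_rate=0.1, l2=1e-6):
        self.weights = {}  # weights exist only for indicators seen so far
        self.bias = 0.0
        self.learning_rate = learning_rate  # hypothetical value
        self.l2 = l2                        # hypothetical value

    def predict(self, features):
        score = self.bias + sum(self.weights.get(f, 0.0) for f in features)
        score = max(min(score, 35.0), -35.0)  # clamp so exp() cannot overflow
        return 1.0 / (1.0 + math.exp(-score))

    def update(self, features, label):
        # For log loss, the gradient with respect to the score is (p - y).
        error = self.predict(features) - label
        self.bias -= self.learning_rate * error
        for f in features:
            w = self.weights.get(f, 0.0)
            self.weights[f] = w - self.learning_rate * (error + self.l2 * w)


# Toy stream (made-up records): one impression that was clicked, one that was not.
model = OnlineLogisticRegression()
clicked = {"device_id": "a1", "site": "news"}
skipped = {"device_id": "b2", "site": "games"}
for _ in range(200):
    model.update(one_hot(clicked), 1)
    model.update(one_hot(skipped), 0)
```

Because weights live in a dictionary keyed by the active indicators, memory grows with the number of distinct values actually observed rather than with the full cardinality of each field. The sketch does nothing about the skewed response; in practice, techniques such as reweighting or subsampling the non-click class would be layered on top.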

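Bibliographic record

  • Author: Zhang, Qiang
  • Author affiliation: University of California, Los Angeles
  • Degree-granting institution: University of California, Los Angeles
  • Subject: Statistics; Marketing
  • Degree: M.S.
  • Year: 2015
  • Pages: 39 p.
  • Total pages: 39
  • Original format: PDF
  • Language of text: English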
