Conference: IEEE Systems and Information Engineering Design Symposium
Using online user behavior to predict demographics

Abstract

Videology, an online video advertising company, is often unable to obtain gender information about incoming online advertisement requests. The company purchases aggregate gender statistics on groups of requests from a third party. This project explores creating groups of requests in which at least 80% of the advertisement requests have the same gender, using 1) traditional clustering algorithms, 2) an iterative linear regression algorithm (ITRA), and 3) a qualitative clustering algorithm (ROCK). In all cases, the data used was either browsing history data or synthetic attributes created by dimensionality reduction to describe that history more simply. None of the three approaches consistently produced the desired gender discrimination, and none emerged as the preferred alternative: the performance of each varied drastically as the test data set, and even subsets of that test data set, changed. However, on the data sets used, the methods were in some instances able to create small buckets (fewer than 3,000 requests) with the desired gender distribution. The success or failure of the algorithms depended on how similar individual requests were to one another (i.e., how many attributes requests shared on average). The approaches performed better when more attributes were shared between requests, that is, when the requests contained information that allowed them to be classified.
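The two quantities the abstract relies on, the share of the majority gender within a bucket and the average number of attributes shared between requests, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names (`bucket_purity`, `qualifying_buckets`, `mean_shared_attributes`), the representation of a request as a set of attribute strings, and the toy data are hypothetical and are not taken from the paper or from Videology's systems.

```python
from collections import Counter, defaultdict
from itertools import combinations


def bucket_purity(bucket_ids, genders):
    """Map each bucket to (majority_gender, majority_share, bucket_size)."""
    groups = defaultdict(list)
    for bucket, gender in zip(bucket_ids, genders):
        groups[bucket].append(gender)
    result = {}
    for bucket, members in groups.items():
        gender, count = Counter(members).most_common(1)[0]
        result[bucket] = (gender, count / len(members), len(members))
    return result


def qualifying_buckets(bucket_ids, genders, threshold=0.8):
    """Return the buckets whose majority gender share meets the threshold."""
    purity = bucket_purity(bucket_ids, genders)
    return sorted(b for b, (_, share, _) in purity.items() if share >= threshold)


def mean_shared_attributes(requests):
    """Average number of attributes shared by each pair of requests.

    Each request is assumed to be a set of attribute names.
    """
    pairs = list(combinations(requests, 2))
    if not pairs:
        return 0.0
    return sum(len(a & b) for a, b in pairs) / len(pairs)


# Toy data (hypothetical): two buckets of five requests each.
bucket_ids = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
genders = ["F", "F", "F", "F", "M", "F", "M", "M", "F", "M"]
good = qualifying_buckets(bucket_ids, genders)

# Toy requests (hypothetical), each a set of browsing attributes.
requests = [{"news", "sports"}, {"news", "sports", "autos"}, {"cooking"}]
overlap = mean_shared_attributes(requests)
```

In this toy example only bucket 0 reaches the 80% threshold (four of its five requests share a gender), while bucket 1 has a 60% majority. The overlap measure is one simple way to quantify how similar requests are to one another; the paper may define similarity differently.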
