IEEE International Conference on Data Science and Advanced Analytics

Classifying Sensitive Content in Online Advertisements with Deep Learning

Abstract

In online advertising, an important quality-control step is to audit advertising images ("creatives") before they appear on publishers' webpages, so that each advertisement appears only on webpages where it is appropriate. Assigning the correct sensitive categories (such as alcohol or tobacco) to each creative is one of the most important parts of this audit. If a sensitive creative is displayed on the wrong webpage, it can ruin the user's experience and the publisher's reputation, and may have legal implications. To protect against this, humans audit every creative before it is displayed through our ad exchange; this process is costly and time-consuming. This paper explains how we automated sensitive category detection. To detect whether a creative has any sensitive content, we use a pre-trained deep convolutional neural network (Xception [1]) to process the creative image and merge its output with the historical distribution of sensitive categories associated with the creative's landing page (the webpage that loads when the ad is clicked, which may also contain sensitive content). This merged representation is then passed into a series of fully connected layers to predict whether the creative belongs to a sensitive category. We show in offline testing that this model performs slightly better than humans (model accuracy 99.92%; human accuracy 99.88%) on a large fraction of creatives (61%) while making 3.5 times fewer mistakes in certain categories for which mistakes are especially costly. These results changed somewhat when the model was deployed at scale in production, where a small modification resulted in classifying fewer creatives than estimated offline, with approximately the same accuracy (52% classified with 99.87% accuracy).
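
The data flow described in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation. The image feature vector stands in for the output of the pre-trained Xception network. Merging by concatenation, the single hidden layer, the random untrained weights, the probability thresholds, and all function names are assumptions made for illustration. Routing low-confidence creatives to a human auditor is likewise an assumption, suggested by the statement that the model classifies only a fraction of creatives.

```python
import math
import random


def landing_page_distribution(history, categories):
    """Turn historical category counts for a landing page into a distribution."""
    total = sum(history.get(c, 0) for c in categories)
    if total == 0:
        return [0.0] * len(categories)  # no audit history for this page
    return [history.get(c, 0) / total for c in categories]


def relu(z):
    return max(0.0, z)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dense(x, weights, bias, activation):
    """One fully connected layer: activation(W x + b)."""
    return [
        activation(sum(w * v for w, v in zip(row, x)) + b)
        for row, b in zip(weights, bias)
    ]


def init_params(n_inputs, n_hidden, seed=0):
    """Small random weights (untrained, illustration only)."""
    rng = random.Random(seed)
    return {
        "w1": [[rng.uniform(-0.1, 0.1) for _ in range(n_inputs)]
               for _ in range(n_hidden)],
        "b1": [0.0] * n_hidden,
        "w2": [[rng.uniform(-0.1, 0.1) for _ in range(n_hidden)]],
        "b2": [0.0],
    }


def predict_sensitive(image_features, page_dist, params):
    """Probability that a creative is sensitive, from the merged inputs."""
    # Assumption: "merge" is modeled here as simple concatenation.
    fused = list(image_features) + list(page_dist)
    hidden = dense(fused, params["w1"], params["b1"], relu)
    (probability,) = dense(hidden, params["w2"], params["b2"], sigmoid)
    return probability


def route(probability, low=0.01, high=0.99):
    """Hypothetical routing: automate only confident predictions."""
    if probability >= high:
        return "sensitive"
    if probability <= low:
        return "not_sensitive"
    return "human_audit"


if __name__ == "__main__":
    categories = ["alcohol", "tobacco"]
    page_dist = landing_page_distribution({"alcohol": 3, "tobacco": 1}, categories)
    image_features = [0.1, 0.2, 0.3, 0.4]  # stand-in for CNN image features
    params = init_params(len(image_features) + len(categories), n_hidden=5)
    p = predict_sensitive(image_features, page_dist, params)
    print(page_dist, route(p))
```

Under these assumptions, tightening the `low` and `high` thresholds lowers the share of creatives classified automatically, which is the kind of coverage-versus-accuracy trade-off the reported offline (61%) and production (52%) figures reflect.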
