IEEE International Conference on Image Processing

A data-driven approach to cleaning large face datasets

Abstract

Large face datasets are important for advancing face recognition research, but they are tedious to build, because a lot of work has to go into cleaning the huge amount of raw data. To facilitate this task, we describe an approach to building face datasets that starts with detecting faces in images returned from searches for public figures on the Internet, followed by discarding those not belonging to each queried person. We formulate the problem of identifying the faces to be removed as a quadratic programming problem, which exploits the observations that faces of the same person should look similar, have the same gender, and normally appear at most once per image. Our results show that this method can reliably clean a large dataset, leading to a considerable reduction in the work needed to build it. Finally, we are releasing the FaceScrub dataset that was created using this approach. It consists of 141,130 faces of 695 public figures and can be obtained from http://vintage.winklerbros.net/facescrub.html.
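The abstract does not give the quadratic program itself, so the following is only a minimal sketch of how the three stated observations (similar appearance, same gender, at most one face per image) could be folded into a box-constrained quadratic objective. The objective, the function name `clean_faces`, the parameters `tau`, `lam`, and `mu`, and the toy data are all assumptions made for illustration, not the authors' formulation. Projected gradient ascent stands in for a proper QP solver, and the objective here is not guaranteed to be concave, so the result is a local optimum.

```python
# Hypothetical sketch: choose x in [0, 1]^n (x_i near 1 means "keep face i")
# to maximize  f(x) = x^T W x + b^T x,  where
#   W[i][j] = sim[i][j] - tau   (reward similar pairs, penalize dissimilar ones)
#             - mu              (extra penalty if i and j share an image)
#   b[i]    = +lam if the face matches the queried person's gender, else -lam
# None of these terms or weights come from the paper; they are assumptions.

def clean_faces(sim, gender_match, image_ids,
                tau=0.5, lam=0.5, mu=1.0, steps=200, lr=0.05):
    n = len(sim)

    # Pairwise weights (symmetric as long as sim is symmetric).
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            w = sim[i][j] - tau
            if image_ids[i] == image_ids[j]:
                w -= mu
            W[i][j] = w

    # Linear gender term.
    b = [lam if g else -lam for g in gender_match]

    # Projected gradient ascent on the box [0, 1]^n.
    x = [0.5] * n
    for _ in range(steps):
        # Gradient of x^T W x + b^T x is 2 W x + b for symmetric W.
        grad = [2.0 * sum(W[i][j] * x[j] for j in range(n)) + b[i]
                for i in range(n)]
        x = [min(1.0, max(0.0, x[i] + lr * grad[i])) for i in range(n)]

    # Threshold the relaxed solution into keep / discard decisions.
    return [xi >= 0.5 for xi in x]


# Toy data (invented): faces 0-2 look alike; face 3 is a different person
# of a different gender; face 4 is a second face in the same image as face 0.
sim = [
    [1.0, 0.9, 0.9, 0.2, 0.3],
    [0.9, 1.0, 0.9, 0.2, 0.3],
    [0.9, 0.9, 1.0, 0.2, 0.3],
    [0.2, 0.2, 0.2, 1.0, 0.2],
    [0.3, 0.3, 0.3, 0.2, 1.0],
]
gender_match = [True, True, True, False, True]
image_ids = [0, 1, 2, 3, 0]

keep = clean_faces(sim, gender_match, image_ids)
print(keep)
```

On this toy input the sketch keeps faces 0 to 2 and discards faces 3 and 4. In practice the similarity and gender inputs would come from a face descriptor and a gender classifier, and a dedicated QP solver would replace the hand-rolled gradient loop.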
