A Community Detection Approach to Cleaning Extremely Large Face Database

Chi Jin; Ruochun Jin; Kai Chen; Yong Dou

Abstract

Although collecting images from the Internet has made it easier to build large face datasets, time-consuming manual annotation still prevents researchers from constructing larger ones, which makes automatic cleaning of noisy labels highly desirable. However, identifying mislabeled faces automatically is challenging because one person's face images, captured in the wild at all ages, are extraordinarily diverse. We therefore propose a graph-based cleaning method that mainly employs a community detection algorithm and deep CNN models to delete mislabeled images. Because the diversity of faces is preserved in multiple large communities, the cleaned results combine high cleanness with rich data diversity. With this method, we clean the extremely large MS-Celeb-1M face dataset (approximately 10 million images with noisy labels) and obtain a clean version called C-MS-Celeb (6,464,018 images of 94,682 celebrities). By training a single-net model on C-MS-Celeb, without fine-tuning, we achieve 99.67% at Equal Error Rate on the LFW face recognition benchmark, which is comparable to other state-of-the-art results. This demonstrates the positive effect of data cleaning on model training. To the best of our knowledge, C-MS-Celeb is the largest clean face dataset publicly available so far, and it will benefit face recognition researchers.
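The general idea of graph-based cleaning can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not say which community detection algorithm, similarity measure, or thresholds were used. The sketch below assumes cosine similarity between feature vectors, uses a simple deterministic label propagation as a stand-in community detector, and replaces deep CNN features with toy 2-D unit vectors. The names `build_graph`, `label_propagation`, `clean_identity`, `threshold`, and `min_size` are all hypothetical.

```python
import math
from collections import defaultdict

def cosine(u, v):
    """Cosine similarity between two non-zero feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def build_graph(features, threshold):
    """Connect two images when their similarity reaches the threshold."""
    adj = defaultdict(dict)
    for i in range(len(features)):
        for j in range(i + 1, len(features)):
            sim = cosine(features[i], features[j])
            if sim >= threshold:
                adj[i][j] = sim
                adj[j][i] = sim
    return adj

def label_propagation(n, adj, max_iter=100):
    """Stand-in community detector: each node adopts the label with the
    largest total edge weight among its neighbours (ties: smallest label)."""
    labels = list(range(n))
    for _ in range(max_iter):
        changed = False
        for node in range(n):
            neighbours = adj.get(node)
            if not neighbours:
                continue  # isolated node keeps its own label
            weight = defaultdict(float)
            for nb, w in neighbours.items():
                weight[labels[nb]] += w
            best = max(weight.values())
            new_label = min(l for l, w in weight.items() if w == best)
            if new_label != labels[node]:
                labels[node] = new_label
                changed = True
        if not changed:
            break
    communities = defaultdict(list)
    for node, label in enumerate(labels):
        communities[label].append(node)
    return list(communities.values())

def clean_identity(features, threshold=0.95, min_size=3):
    """Keep images that fall in any sufficiently large community."""
    adj = build_graph(features, threshold)
    communities = label_propagation(len(features), adj)
    return sorted(i for c in communities if len(c) >= min_size for i in c)

# Toy data for one identity: two appearance clusters plus two outliers.
angles = [0, 5, 10, 60, 65, 70, 150, 240]
features = [(math.cos(math.radians(a)), math.sin(math.radians(a)))
            for a in angles]

kept = clean_identity(features, threshold=0.95, min_size=3)
print(kept)  # [0, 1, 2, 3, 4, 5]
```

In this toy run both appearance clusters survive as separate communities, while the two isolated images (indices 6 and 7) are dropped as presumed mislabels. Keeping every large community, rather than only the largest one, is what lets a cleaning step of this kind retain diversity such as different ages of the same person.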


Bibliographic details

Source: Computational Intelligence and Neuroscience, 2018, Issue 3
Authors: Chi Jin; Ruochun Jin; Kai Chen; Yong Dou
Original format: PDF
CLC classification: Parasitology
