A Community Detection Approach to Cleaning Extremely Large Face Database

Chi Jin; Ruochun Jin; Kai Chen; Yong Dou; Amparo Alonso-Betanzos

Abstract

Although the Big Data era has made it easier to build large face datasets by collecting images from the Internet, the time-consuming manual annotation process still prevents researchers from constructing larger ones, which makes automatic cleaning of noisy labels highly desirable. However, identifying mislabeled faces by machine is quite challenging, because a person’s face images captured in the wild at all ages are extraordinarily diverse. In view of this, we propose a graph-based cleaning method that mainly employs a community detection algorithm and deep CNN models to delete mislabeled images. Because the diversity of faces is preserved in multiple large communities, our cleaning results combine high cleanness with rich data diversity. With our method, we clean the extremely large MS-Celeb-1M face dataset (approximately 10 million images with noisy labels) and obtain a clean version called C-MS-Celeb (6,464,018 images of 94,682 celebrities). By training a single-net model on C-MS-Celeb, without fine-tuning, we achieve 99.67% at Equal Error Rate on the LFW face recognition benchmark, which is comparable to other state-of-the-art results. This demonstrates the positive effect of data cleaning on model training. To the best of our knowledge, C-MS-Celeb is the largest clean face dataset publicly available so far, and it should benefit face recognition researchers.
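
The graph-based idea in the abstract can be illustrated with a minimal sketch. The abstract does not say which community detection algorithm, similarity measure, or thresholds are used, so everything below is an assumption made for illustration: short hand-written vectors stand in for deep CNN features, cosine similarity with a hypothetical `threshold` builds the graph, a simple label propagation routine stands in for the community detection step, and a hypothetical `min_size` decides which communities count as large. The function names (`build_graph`, `label_propagation`, `clean_identity`) are likewise invented here.

```python
import math
from collections import Counter, defaultdict


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def build_graph(features, threshold):
    """Link two images when their feature similarity reaches the threshold."""
    neighbors = defaultdict(set)
    for i in range(len(features)):
        for j in range(i + 1, len(features)):
            if cosine(features[i], features[j]) >= threshold:
                neighbors[i].add(j)
                neighbors[j].add(i)
    return neighbors


def label_propagation(n, neighbors, max_rounds=20):
    """Stand-in community detection: a node adopts its neighbors' most common label."""
    labels = list(range(n))
    for _ in range(max_rounds):
        changed = False
        for node in range(n):
            if not neighbors[node]:
                continue  # isolated image keeps its own label
            counts = Counter(labels[m] for m in neighbors[node])
            best = max(counts.values())
            # Break ties deterministically by taking the smallest label.
            new_label = min(lab for lab, c in counts.items() if c == best)
            if new_label != labels[node]:
                labels[node] = new_label
                changed = True
        if not changed:
            break
    return labels


def clean_identity(features, threshold=0.9, min_size=2):
    """Return sorted indices of the images kept for one labeled identity."""
    neighbors = build_graph(features, threshold)
    labels = label_propagation(len(features), neighbors)
    communities = defaultdict(list)
    for idx, label in enumerate(labels):
        communities[label].append(idx)
    kept = []
    for members in communities.values():
        if len(members) >= min_size:  # every large community is preserved
            kept.extend(members)
    return sorted(kept)


# Toy features for one identity: two tight groups (for example, the same
# person at different ages) and one unrelated image carrying the same label.
features = [
    (1.0, 0.0), (0.9, 0.1), (0.95, 0.05),
    (0.0, 1.0), (0.1, 0.9), (0.05, 0.95),
    (-1.0, -1.0),
]
print(clean_identity(features))  # [0, 1, 2, 3, 4, 5]
```

In this toy run both groups survive because each forms its own community, while the isolated image is dropped. That mirrors the abstract's point that keeping multiple large communities preserves diversity, although the real pipeline may differ in every detail.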

Bibliographic details

Source
Computational Intelligence and Neuroscience | 2018, Issue 1 | 10 pages
Authors
Chi Jin; Ruochun Jin; Kai Chen; Yong Dou; Amparo Alonso-Betanzos
Affiliations

Computer School;

National Laboratory for Parallel and Distributed Processing;

National Laboratory for Parallel and Distributed Processing;

National Laboratory for Parallel and Distributed Processing;

Indexing information
Original format: PDF
Language: English
CLC classification: Parasitology

