Conference: International Workshop on Big Data Benchmarking

Benchmarking Fast-Data Platforms for the Aadhaar Biometric Database

Abstract

Aadhaar is the world's largest biometric database, with a billion records, and is being compiled as an identity platform to deliver social services to residents of India. Aadhaar processes streams of biometric data as residents are enrolled and their records updated. Besides ~1 million enrollments and updates per day, up to 100 million daily biometric authentications are expected during the delivery of various public services. These are critical Big Data applications, with large data volumes and high data velocity. Here, we propose a stream processing workload, based on the Aadhaar enrollment and authentication applications, as a Big Data benchmark for distributed stream processing systems. We describe the composition of these applications and characterize their task latencies and selectivity, as well as their data rate and size distributions, based on real observations. We also validate this benchmark on Apache Storm using synthetic streams and simulated application logic. This paper offers a unique glimpse into an operational national identity infrastructure, and proposes a benchmark for "fast data" platforms to support such eGovernance applications.
