首页> 外文会议>International Symposium on Parallel Distributed Processing >High-Speed String Searching against Large Dictionaries on the Cell/B.E. Processor
【24h】

High-Speed String Searching against Large Dictionaries on the Cell/B.E. Processor

机译:高速串搜索小区/ B.E上的大词典。处理器

获取原文

摘要

Our digital universe is growing, creating exploding amounts of data which need to be searched, filtered and protected. String searching is at the core of the tools we use to curb this explosion, such as search engines, network intrusion detection systems, spam filters, and anti-virus programs. But as communication speed grows, our capability to perform string searching in real-time seems to fall behind. Multi-core architectures promise enough computational power to cope with the incoming challenge, but it is still unclear which algorithms and programming models to use to unleash this power. We have parallelized a popular string searching algorithm, Aho-Corasick, on the IBM Cell/B.E. processor, with the goal of performing exact string matching against large dictionaries. In this article we propose a novel approach to fully exploit the DMA-based communication mechanisms of the Cell/B.E. to provide an unprecedented level of aggregate performance with irregular access patterns. We have discovered that memory congestion plays a crucial role in determining the performance of this algorithm. We discuss three aspects of congestion: memory pressure, layout issues and hot spots, and we present a collection of algorithmic solutions to alleviate these problems and achieve quasi-optimal performance. The implementation of our algorithm provides a worst-case throughput of 2.5 Gbps, and a typical throughput between 3.3 and 4.4 Gbps, measured on realistic scenarios with a two-processor Cell/B.E. system.
机译:我们的数字宇宙正在增长,创建需要搜索,过滤和保护的爆炸数量。字符串搜索是在我们用来抑制该爆炸的工具的核心,例如搜索引擎,网络入侵检测系统,垃圾邮件过滤器和防病毒程序。但随着通信速度的增长,我们能够在实时搜索的字符串搜索的能力似乎落后。多核架构承诺足够的计算能力来应对传入的挑战,但仍然不清楚哪些算法和用于释放这种功率的编程模型。我们在IBM Cell / B.E上并行化了流行的String搜索算法Aho-Corasick。处理器,目的是执行针对大词典的精确字符串。在本文中,我们提出了一种新颖的方法来充分利用Cell / B.E的基于DMA的通信机制。提供不规则的访问模式的前所未有的聚合性能水平。我们已经发现,在确定该算法的性能方面,记忆拥塞起着至关重要的作用。我们讨论了拥塞的三个方面:记忆力,布局问题和热点,我们展示了一系列算法解决方案,以减轻这些问题并实现准优越性能。我们的算法的实现提供了2.5 Gbps的最坏情况吞吐量,以及3.3和4.4 Gbps之间的典型吞吐量,以双处理器单元/ B.E为现实的情景测量。系统。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号