High-Speed String Searching against Large Dictionaries on the Cell/B.E. Processor

机译：高速串搜索小区/ B.E上的大词典。处理器

获取原文

页面导航

摘要
著录项
相似文献
相关主题

摘要

Our digital universe is growing, creating exploding amounts of data which need to be searched, filtered and protected. String searching is at the core of the tools we use to curb this explosion, such as search engines, network intrusion detection systems, spam filters, and anti-virus programs. But as communication speed grows, our capability to perform string searching in real-time seems to fall behind. Multi-core architectures promise enough computational power to cope with the incoming challenge, but it is still unclear which algorithms and programming models to use to unleash this power. We have parallelized a popular string searching algorithm, Aho-Corasick, on the IBM Cell/B.E. processor, with the goal of performing exact string matching against large dictionaries. In this article we propose a novel approach to fully exploit the DMA-based communication mechanisms of the Cell/B.E. to provide an unprecedented level of aggregate performance with irregular access patterns. We have discovered that memory congestion plays a crucial role in determining the performance of this algorithm. We discuss three aspects of congestion: memory pressure, layout issues and hot spots, and we present a collection of algorithmic solutions to alleviate these problems and achieve quasi-optimal performance. The implementation of our algorithm provides a worst-case throughput of 2.5 Gbps, and a typical throughput between 3.3 and 4.4 Gbps, measured on realistic scenarios with a two-processor Cell/B.E. system.

机译：我们的数字宇宙正在增长，创建需要搜索，过滤和保护的爆炸数量。字符串搜索是在我们用来抑制该爆炸的工具的核心，例如搜索引擎，网络入侵检测系统，垃圾邮件过滤器和防病毒程序。但随着通信速度的增长，我们能够在实时搜索的字符串搜索的能力似乎落后。多核架构承诺足够的计算能力来应对传入的挑战，但仍然不清楚哪些算法和用于释放这种功率的编程模型。我们在IBM Cell / B.E上并行化了流行的String搜索算法Aho-Corasick。处理器，目的是执行针对大词典的精确字符串。在本文中，我们提出了一种新颖的方法来充分利用Cell / B.E的基于DMA的通信机制。提供不规则的访问模式的前所未有的聚合性能水平。我们已经发现，在确定该算法的性能方面，记忆拥塞起着至关重要的作用。我们讨论了拥塞的三个方面：记忆力，布局问题和热点，我们展示了一系列算法解决方案，以减轻这些问题并实现准优越性能。我们的算法的实现提供了2.5 Gbps的最坏情况吞吐量，以及3.3和4.4 Gbps之间的典型吞吐量，以双处理器单元/ B.E为现实的情景测量。系统。

著录项

来源
《International Symposium on Parallel Distributed Processing》|2008年||共12页
会议地点
作者
Daniele Paolo Scarpazza; Oreste Villa; Fabrizio Petrini;
展开▼
作者单位

展开▼
会议组织
原文格式 PDF
正文语种
中图分类 TP311.138-53;
关键词

相似文献

外文文献
中文文献
专利

1. Multi-Stride String Searching for High-Speed Content Inspection [J] . Derek Pao, Xing Wang The Computer journal . 2012,第10期

机译：多步字符串搜索，用于高速内容检查
2. Top-Performance Tokenization and Small-Ruleset Regular Expression Matching: A Quantitative Performance Analysis and Optimization Study on the Cell/B.E. Processor [J] . Daniele Paolo Scarpazza International journal of parallel programming . 2011,第1期

机译：高性能标记化和小规则集正则表达式匹配：Cell / B.E。的定量性能分析和优化研究。处理器
3. Top-Performance Tokenization and Small-Ruleset Regular Expression Matching A Quantitative Performance Analysis and Optimization Study on the Cell/B.E. Processor [J] . Daniele Paolo Scarpazza International Journal of Parallel Programming . 2011,第1期

机译：高性能标记化和小规则集正则表达式匹配Cell / B.E。的量化性能分析和优化研究。处理器
4. High-Speed String Searching against Large Dictionaries on the Cell/B.E. Processor [C] . Daniele Paolo Scarpazza, Oreste Villa, Fabrizio Petrini International Symposium on Parallel Distributed Processing . 2008

机译：高速串搜索小区/ B.E上的大词典。处理器
5. Dictionary Learning and Multidimensional Processing for Tensor Data [D] . Shakeri, Zahra. 2019

机译：张量数据的字典学习和多维处理
6. Online Searching: A Dictionary Bibliographic Guide [O] . Richard Dalton 1984

机译：在线搜索：词典和书目指南
7. High-Speed String Searching against Large Dictionaries on the Cell/B.E. Processor [O] . Daniele Paolo Scarpazza, Oreste Villa 2008

机译：在Cell / B.E。上针对大型词典进行高速字符串搜索处理器

High-Speed String Searching against Large Dictionaries on the Cell/B.E. Processor

摘要

著录项

相似文献

相关主题

期刊订阅