International Journal of Parallel Programming

Top-Performance Tokenization and Small-Ruleset Regular Expression Matching: A Quantitative Performance Analysis and Optimization Study on the Cell/B.E. Processor



Abstract

In the last decade, the volume of unstructured data that Internet and enterprise applications create and consume has been growing at impressive rates. The tools we use to process these data include search engines, business-analytics suites, natural-language processors and XML processors. These tools rely on tokenization, a form of regular expression matching aimed at extracting words and keywords from a character stream. The further growth of unstructured data-processing paradigms depends critically on the availability of high-performance tokenizers. Despite the impressive amount of parallelism that the multi-core revolution has made available (in terms of multiple threads and wider SIMD units), most applications employ tokenizers that do not exploit this parallelism. I present a technique for designing tokenizers that exploit multiple threads and wide SIMD units to process multiple independent streams of data at high throughput. The technique benefits indefinitely from any future scaling in the number of threads or in SIMD width. I show the approach's viability by presenting a family of tokenizer kernels, optimized for the Cell/B.E. processor, that deliver a performance seen so far only on dedicated hardware. These kernels deliver a peak throughput of 14.30 Gbps per chip and a typical throughput of 9.76 Gbps on Wikipedia input. They also achieve almost-ideal resource utilization (99.2%). The approach is applicable to any SIMD-enabled processor and matches the trend toward wider SIMD units in contemporary architecture design.
