Conference: Parallel and Distributed Computing, Applications and Technologies, 2009

Key Elements Tracing Method for Parallel XML Parsing in Multi-Core System

Abstract

Though XML is used intensively in many applications, XML parsing is impractical in many fields because of its poor performance. Parallel XML parsing on multi-core systems is a promising remedy. Previous methods all take a data-parallel approach to XML parsing. Because XML is semi-structured, they must divide the data into well-formed XML chunks and then parse these chunks in parallel. This division process is called preparsing. Because preparsing is serial, it becomes the bottleneck of parallel XML parsing. A related work, Simultaneous Finite Transducer (SFTXP), parallelized the preparsing stage. It maintains multiple preparser results for each equal-sized chunk by enumerating all possible parsing states. Although the number of states for each XML document is finite, the overhead of SFTXP is tremendous, including the CPU time to generate the multiple results and the memory to store them. In this work, we address parallel XML parsing with the Key Element Parse Tracing (KEPT) method, which parallelizes preparsing and parsing at the element level. It recasts preparsing as a key element extraction process and schedules the processing of key elements within the KEPT framework. The parsing process is then parallelized as a whole. To demonstrate its effectiveness, we implement it on libxml2 and obtain good scalability on both an 8-core Linux machine and an 8-core 24 SMT Sun machine running Solaris.
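To make the bottleneck concrete, the following is a minimal Python sketch of the data-parallel baseline the abstract attributes to previous methods: a serial preparsing pass that cuts the document into well-formed chunks, followed by parallel parsing of those chunks. It is not the KEPT method and not libxml2 code. The names `preparse` and `parse_parallel` are hypothetical, and the sketch assumes a document without comments, CDATA sections, namespaces, or `>` inside attribute values.

```python
import re
import xml.etree.ElementTree as ET
from concurrent.futures import ThreadPoolExecutor

# Matches opening, closing, and self-closing element tags (simplified; see assumptions above).
TAG = re.compile(r"<(/?)([A-Za-z_][\w.-]*)[^>]*?(/?)>")


def preparse(doc):
    """Serial pass: return (start, end) offsets of each direct child of the root."""
    spans, depth, start = [], 0, None
    for m in TAG.finditer(doc):
        closing, self_closing = m.group(1), m.group(3)
        if closing:
            depth -= 1
            if depth == 1:  # a direct child of the root just closed
                spans.append((start, m.end()))
        elif self_closing:
            if depth == 1:  # an empty direct child of the root
                spans.append((m.start(), m.end()))
        else:
            if depth == 1:  # a direct child of the root just opened
                start = m.start()
            depth += 1
    return spans


def parse_parallel(doc, workers=4):
    """Parse each well-formed chunk independently; results keep document order."""
    spans = preparse(doc)  # this step is serial, hence the bottleneck
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda s: ET.fromstring(doc[s[0]:s[1]]), spans))


if __name__ == "__main__":
    doc = "<root><a><b/></a><c x='1'>t</c><d/></root>"
    print([elem.tag for elem in parse_parallel(doc)])
```

The threads here only show the structure of the approach; in CPython the global interpreter lock would limit any real speedup. The point is that `preparse` must scan the whole document before any worker can start, which is the serial stage the abstract identifies as the bottleneck.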
