Distributed SLCA-Based XML Keyword Search by Map-Reduce

Abstract

Large volumes of XML data arrive continually from new Web applications, and SLCA (Smallest Lowest Common Ancestor)-based XML keyword search is one of the most important information retrieval approaches. Previous approaches focus on building indexes for XML documents. However, in information dissemination scenarios it is impossible to build an index in advance for continuous XML document streams. This paper addresses SLCA-based keyword search over continuous XML documents using the Map-Reduce mechanism. We use parallel algorithms to process large numbers of XML documents in a Hadoop environment. A distributed SLCA computation method is designed in which each network node computes SLCAs independently and only a small amount of information needs to be transmitted. We build a real Hadoop environment and demonstrate the efficiency of our algorithms analytically and experimentally.
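The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that each XML document fits on one worker, that keywords match whole lowercase tokens in element tags or text, and that nodes are identified by Dewey-style index paths. The function names `slca`, `map_doc`, and `reduce_results` are hypothetical, and the map and reduce steps are simulated in plain Python rather than run on Hadoop.

```python
import xml.etree.ElementTree as ET


def slca(xml_text, keywords):
    """Return Dewey-style labels of the SLCA nodes in one XML document.

    A node is an SLCA if its subtree contains every keyword and no
    child subtree also contains every keyword.
    """
    wanted = {k.lower() for k in keywords}
    if not wanted:
        return []
    results = []

    def visit(node, dewey):
        # Keywords matched directly by this node's tag or text.
        words = set((node.text or "").lower().split()) | {node.tag.lower()}
        found = wanted & words
        child_covers_all = False
        for i, child in enumerate(node):
            child_found = visit(child, dewey + (i,))
            if child_found == wanted:
                child_covers_all = True
            found |= child_found
            # Text after a child's closing tag belongs to this node.
            found |= wanted & set((child.tail or "").lower().split())
        if found == wanted and not child_covers_all:
            results.append(dewey)
        return found

    visit(ET.fromstring(xml_text), (0,))
    return results


def map_doc(doc_id, xml_text, keywords):
    """Map step (assumed): a worker computes SLCAs for its own document
    and emits only small (doc_id, label) pairs."""
    return [(doc_id, label) for label in slca(xml_text, keywords)]


def reduce_results(mapped):
    """Reduce step (assumed): group the emitted labels by document."""
    merged = {}
    for pairs in mapped:
        for doc_id, label in pairs:
            merged.setdefault(doc_id, []).append(label)
    return merged


if __name__ == "__main__":
    docs = {
        "d1": "<bib><book><title>XML search</title><author>Smith</author></book></bib>",
        "d2": "<bib><book><title>XML</title></book><paper><author>Smith</author></paper></bib>",
    }
    query = ["xml", "smith"]
    mapped = [map_doc(doc_id, text, query) for doc_id, text in docs.items()]
    print(reduce_results(mapped))
```

In this sketch only the labels of matching nodes leave a worker, which mirrors the abstract's point that each node computes SLCAs independently and transmits little information. How the paper partitions documents or streams across nodes is not stated here, so that part is left out.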
