Journal: IEEE Transactions on Parallel and Distributed Systems

BloomCast: Efficient and Effective Full-Text Retrieval in Unstructured P2P Networks

Abstract

Efficient and effective full-text retrieval in unstructured peer-to-peer (P2P) networks remains a challenge in the research community. First, it is difficult, if not impossible, for unstructured P2P systems to locate items effectively with guaranteed recall. Second, existing schemes for improving the search success rate often rely on placing a large number of item replicas across the wide area network, incurring substantial communication and storage costs. In this paper, we propose BloomCast, an efficient and effective full-text retrieval scheme for unstructured P2P networks. By leveraging a hybrid P2P protocol, BloomCast replicates items uniformly at random across the P2P network, achieving guaranteed recall at a communication cost of $O(\sqrt{N})$, where $N$ is the size of the network. Furthermore, by casting Bloom filters instead of raw documents across the network, BloomCast significantly reduces the communication and storage costs of replication. We demonstrate the power of the BloomCast design through both mathematical proof and comprehensive simulations based on query logs from a major commercial search engine and the NIST TREC WT10G data collection. Results show that BloomCast achieves an average query recall of 91 percent, outperforming the existing WP algorithm by 18 percent, while reducing search latency for query processing by 57 percent.
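
The two ideas in the abstract, casting Bloom filters in place of raw documents and replicating uniformly at random, can be illustrated with a minimal Python sketch. The abstract does not give the paper's hash functions, filter size, or replication protocol, so every name and parameter here (`BloomFilter`, `encode_document`, `matches_query`, `hit_probability`, 1024 bits, 4 hashes) is an assumption made for illustration. `hit_probability` uses a generic sampling-without-replacement model to suggest why replica and probe counts on the order of $\sqrt{N}$ can give a high hit rate; it is not claimed to be the paper's proof.

```python
import hashlib


class BloomFilter:
    """Fixed-size Bloom filter over string terms (illustrative parameters)."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 4) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, term: str) -> list[int]:
        # Double hashing: derive k bit positions from one SHA-256 digest.
        digest = hashlib.sha256(term.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force an odd step
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, term: str) -> None:
        for pos in self._positions(term):
            self.bits |= 1 << pos

    def might_contain(self, term: str) -> bool:
        # False means "definitely absent"; True means "possibly present".
        return all((self.bits >> pos) & 1 for pos in self._positions(term))


def encode_document(text: str) -> BloomFilter:
    """Summarize a document's terms in a filter that can be replicated cheaply."""
    bf = BloomFilter()
    for term in set(text.lower().split()):
        bf.add(term)
    return bf


def matches_query(bf: BloomFilter, query: str) -> bool:
    """A document is a candidate only if every query term may be present."""
    return all(bf.might_contain(term) for term in query.lower().split())


def hit_probability(n: int, replicas: int, probes: int) -> float:
    """Chance that `probes` distinct random nodes include at least one of
    `replicas` distinct random replica holders, in a network of `n` nodes."""
    if replicas + probes > n:
        return 1.0  # pigeonhole: the two sets must overlap
    miss = 1.0
    for i in range(probes):
        miss *= (n - replicas - i) / (n - i)
    return 1.0 - miss
```

A filter never reports a false negative, so a node holding only the replicated filter can rule a document out locally. A positive match may be a false positive and would still need to be confirmed against the document itself. In this model, with `n = 10_000`, setting both `replicas` and `probes` to 100 (that is, $\sqrt{N}$) gives a hit probability of roughly 0.64, and tripling both pushes it above 0.999.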
