Conference: International Conference on Advanced Networking and Applications

A Class-Based Search System in Unstructured P2P Networks


Abstract

Efficient search is an important design issue in peer-to-peer (P2P) networks. Among the various search techniques, semantic-based search has recently drawn significant attention. The Gnutella-like efficient searching system (GES) [18] is one such system. GES derives a node vector, a semantic summary of all the documents on a node, based on the vector space model (VSM). Its topology adaptation algorithm and search protocol are then designed around the similarity between the node vectors of different nodes. However, although GES is suitable when the documents on each node are uniformly distributed, it may not be efficient when the distribution is diverse: when a node holds documents from many categories, the node vector may represent them inaccurately. We extend the idea of GES and present a class-based semantic searching system (CSS). CSS uses a data clustering algorithm, online spherical k-means clustering (OSKM) [16], to cluster all documents on a node into several classes. Each class can be viewed as a virtual node, and virtual nodes are connected through virtual links. As a result, the class vector replaces the node vector and plays an important role in class-based topology adaptation and search, which makes CSS very efficient. Our simulation using the IR benchmark TREC collection demonstrates that CSS outperforms GES, with higher recall, higher precision, and lower search cost.
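The difference between a single node vector and per-class vectors can be illustrated with a minimal sketch. It is not the paper's implementation: the raw term-frequency weighting, the fixed learning rate, the seeding from the first k documents, and every function name and sample document below are assumptions made for illustration, and the clustering routine is only a simplified stand-in for OSKM [16].

```python
import math
from collections import Counter


def normalize(vec):
    """Scale a sparse term vector to unit length (left unchanged if it is all zero)."""
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else dict(vec)


def tf_vector(text):
    """Unit-length term-frequency vector, a plain stand-in for a VSM document vector."""
    return normalize(dict(Counter(text.lower().split())))


def cosine(a, b):
    """Cosine similarity of two unit-length sparse vectors (their dot product)."""
    if len(a) > len(b):
        a, b = b, a
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def node_vector(doc_vectors):
    """One summary per node: the normalized sum of all its document vectors."""
    total = Counter()
    for vec in doc_vectors:
        for term, weight in vec.items():
            total[term] += weight
    return normalize(dict(total))


def online_spherical_kmeans(doc_vectors, k, rate=0.5, epochs=5):
    """Simplified online spherical k-means (assumed variant, fixed learning rate).

    Each document pulls its most similar centroid toward itself, and the
    centroid is projected back onto the unit sphere after every update.
    """
    centroids = [dict(v) for v in doc_vectors[:k]]  # naive seeding (assumption)
    for _ in range(epochs):
        for vec in doc_vectors:
            best = max(range(k), key=lambda i: cosine(centroids[i], vec))
            moved = Counter(centroids[best])
            for term, weight in vec.items():
                moved[term] += rate * weight
            centroids[best] = normalize(dict(moved))
    labels = [max(range(k), key=lambda i: cosine(centroids[i], v))
              for v in doc_vectors]
    return centroids, labels


# Hypothetical node holding documents from two unrelated categories.
docs = [
    "peer network routing overlay",
    "protein gene cell genome",
    "overlay network peer search",
    "gene protein cell sequence",
]
doc_vectors = [tf_vector(d) for d in docs]
query = tf_vector("gene sequence")

node_sim = cosine(query, node_vector(doc_vectors))
class_vectors, labels = online_spherical_kmeans(doc_vectors, k=2)
class_sim = max(cosine(query, c) for c in class_vectors)

print("labels:", labels)
print("query vs node vector:  %.3f" % node_sim)
print("query vs best class:   %.3f" % class_sim)
```

On this toy node the query matches one class vector more closely than it matches the node vector, because the node vector averages over both categories. That is the intuition behind treating each class as a virtual node, though the sketch says nothing about the recall, precision, or search cost reported for CSS.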
