Journal: International Journal of Applied Engineering Research

Implementation of Multilevel Indexing in Search Engines

Abstract

Indexing in search engines is an active area of research. Efficient, fast access to the index is a key factor in the performance of web search engines. Search engines use inverted indexes, which consist of an array of posting lists; each posting list is associated with a term and contains the identifiers of the documents that contain that term. Searching a single large global index for the documents relevant to a user query takes a long time. This paper proposes a multilevel indexing algorithm, i.e. creating indexes at multiple levels to reduce search time by directing the search along a specific path from higher-level indexes to lower-level indexes. The multilevel indexes are created after hierarchical clusters of documents have been formed on the basis of document similarity. A separate index is created at each level of the document cluster hierarchy, and the search for a document proceeds from the higher-level indexes down to the lowest-level index, which is the document-level index and is stored as an inverted file. The paper implements the proposed hierarchical algorithm at these levels: the search is directed from higher levels of clustering to lower levels, i.e. from super clusters to mega clusters, then to clusters, and finally to individual documents, so that the user gets the best-matching results in as little time as possible. The paper also presents graphs showing the clusters created at the different levels.
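
The idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not give the clustering method or the index layout, so the hierarchy below is assigned by hand, and the names `ClusterNode`, `build_inverted_index`, and `search` are hypothetical. The lowest level holds an ordinary inverted index (term to document identifiers); each higher level holds an index from terms to the child clusters that contain them, so a query only descends into matching branches.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to the sorted list of ids of documents containing it."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}


class ClusterNode:
    """One node of a hand-built cluster hierarchy (illustrative only).

    A leaf cluster stores a document-level inverted index.
    A higher-level node stores an index from terms to child clusters.
    """

    def __init__(self, name, children=None, docs=None):
        self.name = name
        self.children = children or []
        if self.children:
            # Higher-level index: term -> child clusters containing the term.
            self.index = defaultdict(list)
            for child in self.children:
                for term in child.terms():
                    self.index[term].append(child)
        else:
            # Lowest-level index: term -> document ids (inverted file).
            self.index = build_inverted_index(docs or {})

    def terms(self):
        return set(self.index)


def search(node, term):
    """Follow the term from the top-level index down to document ids."""
    term = term.lower()
    if not node.children:
        return list(node.index.get(term, []))
    hits = []
    for child in node.index.get(term, []):  # skip branches without the term
        hits.extend(search(child, term))
    return sorted(hits)


# Assumed toy hierarchy: super cluster -> mega clusters -> clusters -> documents.
retrieval = ClusterNode("retrieval", docs={1: "inverted index posting list",
                                           2: "search engine index"})
databases = ClusterNode("databases", docs={3: "database index btree"})
cooking = ClusterNode("cooking", docs={4: "bread baking recipe"})
computing = ClusterNode("computing", children=[retrieval, databases])
food = ClusterNode("food", children=[cooking])
root = ClusterNode("all", children=[computing, food])

print(search(root, "index"))
print(search(root, "bread"))
```

In this toy hierarchy a query for "bread" never touches the "computing" branch, which is the kind of path pruning the abstract describes. A real system would also need the similarity-based clustering step and multi-term query handling, neither of which is sketched here.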