Indian Journal of Science and Technology

Wavelet Tree based Hybrid Geo-Textual Indexing Technique for Geographical Search


Abstract

Background/Objectives: There is significant commercial and research interest in location-based search for search engines. Searching for keywords that belong to one or more locations (geographic references) requires geographical web search and ranking on the basis of spatial and textual relevancy. This type of search requires both spatial and textual indexing.

Methods/Statistical Analysis: This paper uses a new spatial-textual hybrid indexing technique, based on the Wavelet Tree (WT), to handle point and region queries for Geographical Information Retrieval. Here, the WT data structure is used for both textual and spatial indexing. Minimum Bounding Rectangles (MBRs) of different geographical points (latitude, longitude) are created to build the hybrid index. For searching textual keywords, an inverted index is needed; it is also created using a wavelet tree. In addition, a spatial-textual relevancy scheme is used to retrieve relevant documents for end users.

Findings: The algorithm has been implemented in order to measure its performance in terms of search time. Approximately 40,000 Wikipedia pages have been crawled and stored in a database along with the geographical coordinates (latitude, longitude) of locations in India, which are used to build the MBRs of these locations. The results show that the performance of the wavelet tree based hybrid index improves as query length increases. For small query lengths the B/R* tree performs better, but for larger query lengths the wavelet tree based hybrid index outperforms the other techniques. Precision and recall of web documents have also been calculated using the hybrid index. Precision and recall vary with query length, which shows that the reduction in search time preserves precision and recall.

Applications/Improvement: Our algorithm outperforms the existing algorithms in terms of both simplicity of implementation and search time. In future we will propose a compression technique for the hybrid index to minimize the space it occupies, which will further improve search time for documents with single as well as multiple geographical references.
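The abstract does not describe how the wavelet tree itself is built, so the following is only a minimal sketch of the textbook pointer-based structure, not the authors' implementation. It assumes the indexed sequence consists of small non-negative integers (for example, hypothetical document IDs taken from concatenated posting lists), and the class and method names (`WaveletTree`, `rank`, `access`) are illustrative choices. It shows the two basic operations a wavelet-tree-based inverted index typically relies on: `rank`, which counts how often a symbol occurs in a prefix, and `access`, which recovers the symbol at a position.

```python
from typing import List, Optional


class WaveletTree:
    """Pointer-based wavelet tree over a sequence of integers (illustrative sketch)."""

    def __init__(self, seq: List[int], lo: Optional[int] = None, hi: Optional[int] = None):
        if lo is None or hi is None:
            lo, hi = (min(seq), max(seq)) if seq else (0, 0)
        self.lo, self.hi = lo, hi
        self.size = len(seq)
        self.left = self.right = None
        # prefix[i] = how many of the first i symbols were routed to the left child
        self.prefix = [0]
        if lo == hi or not seq:
            return  # leaf node (single symbol) or empty subtree
        mid = (lo + hi) // 2
        for s in seq:
            self.prefix.append(self.prefix[-1] + (1 if s <= mid else 0))
        self.left = WaveletTree([s for s in seq if s <= mid], lo, mid)
        self.right = WaveletTree([s for s in seq if s > mid], mid + 1, hi)

    def rank(self, symbol: int, i: int) -> int:
        """Count occurrences of `symbol` among the first i elements."""
        i = min(i, self.size)
        if symbol < self.lo or symbol > self.hi or i <= 0:
            return 0
        if self.lo == self.hi:
            return i  # every element stored in this leaf equals `symbol`
        mid = (self.lo + self.hi) // 2
        went_left = self.prefix[i]
        if symbol <= mid:
            return self.left.rank(symbol, went_left)
        return self.right.rank(symbol, i - went_left)

    def access(self, i: int) -> int:
        """Return the element at position i of the original sequence."""
        if self.lo == self.hi:
            return self.lo
        if self.prefix[i + 1] - self.prefix[i] == 1:
            return self.left.access(self.prefix[i])
        return self.right.access(i - self.prefix[i])


# Hypothetical sequence of document IDs, used only for illustration.
doc_ids = [3, 1, 3, 0, 2, 3, 1]
wt = WaveletTree(doc_ids)
print(wt.rank(3, len(doc_ids)))  # number of times document 3 appears
print(wt.access(4))              # element at position 4
```

Each query descends one level per halving of the symbol range, so the cost grows with the logarithm of the alphabet size rather than with the length of the sequence. A production index would normally store the per-node counters as compressed bit vectors instead of Python lists.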