FOCUSING WEB CRAWLS ON LOCATION-SPECIFIC CONTENT

Abstract

Retrieving relevant data for location-sensitive keyword queries is a challenging task that has so far been addressed as a problem of automatically determining the geographical orientation of web searches. Unfortunately, identifying localizable queries is not in itself sufficient for performing successful location-sensitive searches unless there exists a geo-referenced index of data sources against which localizable queries are searched. In this paper, we propose a novel approach to the automatic construction of a geo-referenced search engine index. Our approach relies on a geo-focused crawler that incorporates a structural parser and uses GeoWordNet as a knowledge base in order to automatically deduce the geo-spatial information that is latent in the pages' contents. Based on location-descriptive elements in the page URLs and anchor text, the crawler directs the pages to a location-sensitive downloader. This downloading module resolves the geographical references of the URL location elements and organizes them into indexable hierarchical structures. The location-aware URL hierarchies are linked to their respective pages, resulting in a geo-referenced index against which location-sensitive queries can be answered.
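
The pipeline outlined in the abstract (spot location terms in URLs and anchor text, resolve them to a place hierarchy, link the hierarchy to pages) can be illustrated with a minimal sketch. This is not the authors' implementation: the `GAZETTEER` dictionary is a hypothetical toy stand-in for GeoWordNet, and the function names `location_tokens`, `hierarchy`, and `build_index`, as well as the example URLs, are assumptions made only for illustration.

```python
import re
from urllib.parse import urlparse

# Hypothetical toy gazetteer standing in for GeoWordNet (assumption):
# each place name maps to its parent place, or None for a top-level region.
GAZETTEER = {
    "greece": None,
    "athens": "greece",
    "patras": "greece",
    "italy": None,
    "rome": "italy",
}


def location_tokens(url, anchor_text=""):
    """Return gazetteer place names found in the URL and anchor text,
    in order of first appearance and without duplicates."""
    parsed = urlparse(url)
    text = " ".join([parsed.netloc, parsed.path, anchor_text]).lower()
    found = []
    for token in re.split(r"[^a-z]+", text):
        if token in GAZETTEER and token not in found:
            found.append(token)
    return found


def hierarchy(place):
    """Resolve a place to its path from the broadest region down to the place."""
    path = []
    while place is not None:
        path.append(place)
        place = GAZETTEER[place]
    return tuple(reversed(path))


def build_index(links):
    """Map each hierarchical location path to the URLs that mention it."""
    index = {}
    for url, anchor in links:
        for place in location_tokens(url, anchor):
            index.setdefault(hierarchy(place), []).append(url)
    return index


if __name__ == "__main__":
    links = [
        ("http://example.com/travel/athens/hotels", "Hotels in Athens"),
        ("http://example.org/rome-guide", "city guide"),
    ]
    for path, urls in build_index(links).items():
        print(" > ".join(path), urls)
```

A real system would need disambiguation (for example, "Athens" in Greece versus Athens, Georgia) and a far larger knowledge base; the sketch only shows how URL and anchor tokens could feed a hierarchical, location-keyed index.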

Bibliographic details

Source: International Conference on Web Information Systems and Technologies, 2009, 6 pages
Authors: Lefteris Kozanidis; Sofia Stamou; George Spiros
Original format: PDF
Chinese Library Classification: TP3-53
Keywords: Location-sensitive web search; Focused crawling; Geo-referenced index

Similar literature

1. Tunneling enhanced by web page content block partition for focused crawling [J]. Tao Peng, Changli Zhang, Wanli Zuo. Concurrency and Computation. 2008, No. 1
2. Keyword weight optimization using gradient strategies in event focused web crawling [J]. Rajiv S., Navaneethan C. Pattern Recognition Letters. 2021, No. Feba
3. FOCUSED WEB CRAWLING FOR HIGH PERFORMANCE SEARCH ENGINES: ISSUES, TECHNIQUES AND SYSTEMS [J]. SUSHIL KUMAR, NARESH CHAUHAN. International Journal of Computational Intelligence Theory and Practice. 2020, No. 1
4. FOCUSING WEB CRAWLS ON LOCATION-SPECIFIC CONTENT [C]. Lefteris Kozanidis, Sofia Stamou, George Spiros. International Conference on Web Information Systems and Technologies. 2009
5. Connecting link structure and content on the Web for effective focused crawling [D]. Nickerson, Adam Stuart. 2003
6. Domain adaptation of statistical machine translation with domain-focused web crawling [O]. Pavel Pecina, Antonio Toral, Vassilis Papavassiliou
7. Tunneling enhanced by web page content block partition for focused crawling [O]. Tao Peng, Changli Zhang, Wanli Zuo. 2010
8. Focused Crawling of the Deep Web Using Service Class Descriptions [R]. Rocco, D., Liu, L., Critchlow, T. 2005
