Research and design of the crawler system in a vertical search engine

Abstract

The crawler system in a vertical search engine should first normalize a representative sample web page so that it conforms to W3C standards. The normalized page can then be parsed by the visual XPath generator, which yields the desired XPath value. During batch data extraction, the crawler system parses the target web pages and obtains the exact data it needs. A vertical search engine first extracts the necessary data and segments Chinese words, and then presents the data on web pages. The data structuring process that follows data extraction is what distinguishes a vertical search engine from a traditional search engine. A crawler system that can extract domain-specific information from the Internet and process it preliminarily is therefore an indispensable part of a vertical search engine.
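
The extraction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the pages have already been normalized to well-formed XHTML, and the sample pages, field names, and XPath expressions are hypothetical stand-ins for what a visual XPath generator might produce. It uses only the limited XPath subset supported by Python's standard `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET

# Hypothetical target pages, assumed to be already normalized to well-formed XHTML.
PAGES = [
    "<html><body><div class='item'><h1>Widget A</h1>"
    "<span class='price'>12.50</span></div></body></html>",
    "<html><body><div class='item'><h1>Widget B</h1>"
    "<span class='price'>8.00</span></div></body></html>",
]

# Hypothetical XPath expressions, standing in for the output of a
# visual XPath generator run on one representative sample page.
FIELD_XPATHS = {
    "title": ".//div[@class='item']/h1",
    "price": ".//div[@class='item']/span[@class='price']",
}


def extract_record(xhtml, field_xpaths):
    """Apply each XPath to one page and return a structured record."""
    root = ET.fromstring(xhtml)
    record = {}
    for field, xpath in field_xpaths.items():
        node = root.find(xpath)
        # A field that is absent from the page is recorded as None.
        if node is not None and node.text:
            record[field] = node.text.strip()
        else:
            record[field] = None
    return record


def extract_batch(pages, field_xpaths):
    """Batch extraction: reuse the same XPath set on every target page."""
    return [extract_record(page, field_xpaths) for page in pages]


records = extract_batch(PAGES, FIELD_XPATHS)
```

Reusing one XPath set across all pages reflects the idea that pages from the same site share a template; a real crawler would also need an HTML-tidying step before parsing, which this sketch leaves out.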

Bibliographic record

Source
International Conference on Intelligent Computing and Integrated Systems | 2010 | pp. 790-792 | 3 pages
Original format: PDF
CLC classification: Artificial intelligence theory
Keywords
Vertical search engine; crawler system; data extraction; data structuring process; visual XPath generator; word segmentation;

Indexed: 2022-08-26 14:20:34

Similar literature

1. Design of a Least Cost (LC) Vertical Search Engine based on Domain Specific Hidden Web Crawler [J]. Sudhakar Ranjan, Komal Kumar Bhatia. International Journal of Information Retrieval Research. 2017, No. 2
2. Site Design Impact on Robots: An Examination of Search Engine Crawler Behavior at Deep and Wide Websites [J]. D-Lib Magazine. 2008, No. 14
3. Improving the freshness of the search engines by a probabilistic approach based incremental crawler [J]. Pavai G. d, Geetha T. V. Information Systems Frontiers. 2017, No. 5
4. Research and design of the crawler system in a vertical search engine [C]. International Conference on Intelligent Computing and Integrated Systems. 2010
5. Towards next generation vertical search engines [D]. Zheng, Li. 2014
6. Quantitative evaluation of recall and precision of CAT Crawler, a search engine specialized on retrieval of Critically Appraised Topics [O]. Peng Dong, Ling Ling Wong, Sarah Ng. 2004
7. Design and Implementation of Scalable, Fully Distributed Web Crawler for a Web Search Engine [O]. M. Sunil Kumar. 2011
