The Design and Implementation of a Topic-Driven Crawler


Abstract

Users browsing the Internet need web pages to be classified under a given topic as accurately as possible. Topic-driven crawlers are therefore becoming important tools for applications such as specialized web portals, online search, and competitive intelligence. This paper presents a topic-driven crawler that computes the degree of relevance of web pages and refines a preliminary set of related pages using term frequency/document frequency, entropy, and compiled rules. It also proposes a comparatively ideal system architecture for a topic-driven crawler, explains how its modules relate to one another, and describes several of the modules in detail.
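The abstract does not give the paper's formulas, so the following is only a minimal sketch of one common way to score the relevance of a page to a topic from term frequency and document frequency. The function names (`tokenize`, `tf_idf_vector`, `cosine`, `relevance`), the smoothed weighting, and the use of cosine similarity are all assumptions made for illustration, not the authors' method; the entropy and rule-based refinement steps are not shown.

```python
import math
from collections import Counter

PUNCT = ".,;:!?()\"'"


def tokenize(text):
    # Naive whitespace tokenizer (assumption); a real crawler would also
    # strip markup and stop words.
    words = (w.strip(PUNCT).lower() for w in text.split())
    return [w for w in words if w]


def tf_idf_vector(tokens, doc_freq, n_docs):
    # Weight each term by its relative frequency in the text and by a
    # smoothed inverse document frequency over the reference corpus.
    counts = Counter(tokens)
    total = sum(counts.values())
    vec = {}
    for term, count in counts.items():
        idf = math.log((1 + n_docs) / (1 + doc_freq.get(term, 0))) + 1.0
        vec[term] = (count / total) * idf
    return vec


def cosine(a, b):
    # Cosine similarity between two sparse vectors stored as dicts.
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def relevance(page_text, topic_text, corpus):
    # Hypothetical relevance score in [0, 1]: similarity between the page
    # and a textual description of the topic.
    docs = [set(tokenize(d)) for d in corpus]
    doc_freq = Counter(term for d in docs for term in d)
    n_docs = len(docs)
    page_vec = tf_idf_vector(tokenize(page_text), doc_freq, n_docs)
    topic_vec = tf_idf_vector(tokenize(topic_text), doc_freq, n_docs)
    return cosine(page_vec, topic_vec)
```

A crawler built this way might keep only pages whose score exceeds a chosen threshold and follow their links first; the threshold itself would have to be tuned and is not specified in the abstract.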


Bibliographic details

Source: Workshop on Intelligent Information Technology Application, 2007, 4 pages
Authors: Li Qiong; Jin Tao; Fu Yuchen; Liu Quan; Cui Zhiming; IITA
Original format: PDF
Chinese Library Classification: TP18-53

