Focused Crawling of the Deep Web Using Service Class Descriptions


Abstract

Dynamic Web data sources, sometimes known collectively as the Deep Web, increase the utility of the Web by providing intuitive access to data repositories anywhere that Web access is available. Deep Web services provide access to real-time information, such as entertainment event listings, or present a Web interface to large databases or other data repositories. Recent studies suggest that the size and growth rate of the dynamic Web greatly exceed those of the static Web, yet dynamic content is often ignored by existing search engine indexers owing to the technical challenges that arise when attempting to search the Deep Web. To address these challenges, we present DynaBot, a service-centric crawler for discovering and clustering Deep Web sources offering dynamic content. DynaBot has three unique characteristics. First, DynaBot utilizes a service class model of the Web implemented through the construction of service class descriptions (SCDs). Second, DynaBot employs a modular, self-tuning system architecture for focused crawling of the Deep Web using service class descriptions. Third, DynaBot incorporates methods and algorithms for efficient probing of the Deep Web and for discovering and clustering Deep Web sources and services through SCD-based service matching analysis.

Bibliographic details

Authors
Rocco, D.; Liu, L.; Critchlow, T.
Year: 2005
Pages: 1-16
Total pages: 16
Original format: PDF
Text language: English
Chinese Library Classification: Industrial technology
Keywords
Bioinformatics; Web services; World Wide Web; Data sources; Algorithms; Computer programs; Access; Resources; Classification; Data systems; Data sets; Interfaces; Information systems


Similar literature


1. Application of rough ensemble classifier to web services categorization and focused crawling [J]. Suman Saha, C.A. Murthy, Sankar K. Pal. Web Intelligence and Agent Systems, 2010, No. 1

2. A web page distillation strategy for efficient focused crawling based on optimized Naive Bayes (ONB) classifier [J]. Saleh Ahmed I., Abulwafa Arwa E., Al Rahmawy Mohammed F. Applied Soft Computing, 2017

3. Focused Crawling for Automatic Service Discovery, Annotation, and Classification in Industrial Digital Ecosystems [J]. Dong H., Hussain F. K. Industrial Electronics, IEEE Transactions on, 2011, No. 6

4. Focused Deep Web Entrance Crawling by Form Feature Classification [C]. Lin Wang, Ammar Hawbani, Xingfu Wang. International Conference on Big Data Computing and Communications, 2015

5. Connecting link structure and content on the Web for effective focused crawling [D]. Nickerson, Adam Stuart. 2003

6. Domain adaptation of statistical machine translation with domain-focused web crawling [O]. Pavel Pecina, Antonio Toral, Vassilis Papavassiliou

7. iCrawl: Improving the Freshness of Web Collections by Integrating Social Web and Focused Web Crawling [O]. Gossen, Gerhard; Demidova, Elena; Risse, Thomas. 2016

