Intelligent Crawling On Open Web for Business Prospects

Bharat Bhushan; Narender Kumar

Abstract

The dynamic nature of web-based systems requires continuous updating. Information retrieval depends on crawlers that crawl the web exhaustively, but businesses expect their crawlers to retrieve only the specific information their applications need. Crawlers download the required information by following the hyperlinks that occur in web pages, but the information retrieved is usually partial and fails to meet users' expectations. Retrieving updated information from a single link/URL is simple, but when many URLs give the same information, it becomes difficult to determine which URL gives the desired, sufficient, and up-to-date information. It is also difficult to remove duplicate stories from the same link domain. This paper discusses the issues related to intelligent crawling and proposes various techniques to assist web mining for business prospects.
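
The abstract does not say which techniques the paper proposes, so the following is only an illustrative sketch of one common way to approach the duplicate-story problem it describes, not the authors' method. It assumes each crawled page comes with a freshness signal; the `Page` class, its `last_modified` field, and the function names are all hypothetical. Pages are grouped by domain and by a hash of their normalized text, and the most recently modified copy in each group is kept.

```python
import hashlib
import re
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass
class Page:
    # Hypothetical record for a crawled page (not taken from the paper).
    url: str
    text: str
    last_modified: float  # assumed freshness signal, e.g. a Unix timestamp


def fingerprint(text: str) -> str:
    """Hash the text after lowercasing and dropping punctuation and extra spaces."""
    normalized = " ".join(re.findall(r"\w+", text.lower()))
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def deduplicate(pages: list[Page]) -> list[Page]:
    """Keep one page per (domain, fingerprint), preferring the freshest copy."""
    best: dict[tuple[str, str], Page] = {}
    for page in pages:
        domain = urlsplit(page.url).netloc.lower()
        key = (domain, fingerprint(page.text))
        current = best.get(key)
        if current is None or page.last_modified > current.last_modified:
            best[key] = page
    return list(best.values())


if __name__ == "__main__":
    crawled = [
        Page("http://example.com/a", "Hello, World!", 1.0),
        Page("http://example.com/b", "hello   world", 2.0),
        Page("http://other.org/a", "Hello world", 1.5),
    ]
    for page in deduplicate(crawled):
        print(page.url)
```

This sketch only catches copies that are identical after normalization. Stories that differ by a few words would need a near-duplicate measure, such as comparing sets of word shingles, which is outside what the abstract specifies.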

Bibliographic details

Source: International Journal of Computer Science and Network Security, 2012, No. 6, pp. 93-98 (6 pages)
Authors: Bharat Bhushan; Narender Kumar
Author affiliations:

Department of Computer Science & Applications, Guru Nanak Khalsa College, Yamuna Nagar (Haryana), India;

Narender Kumar, Research Scholar, Enrollment No. 1050103206, Shighania University, V.P.O. Pacheri Bari, Dist. Jhunjhunu, Rajasthan, India

Original format: PDF
Language: English
Keywords: web crawler; latency; ethics; reliability; longevity

