Conference: IEEE/WIC/ACM International Conference on Intelligent Agent Technology

Board Forum Crawling: A Web Crawling Method for Web Forum

Abstract

We present Board Forum Crawling, a new method for crawling Web forums. The method exploits the organized structure of Web forum sites and simulates the way a human visits them. It starts crawling from the homepage, then enters each board of the site, and then crawls all the posts of the site directly. Board Forum Crawling can crawl the most meaningful information of a Web forum site efficiently and simply. We experimentally evaluated the effectiveness of the method on real Web forum sites by comparing it with traditional breadth-first crawling. We also used the method in a real project, and 12,000 Web forum sites have been crawled successfully. These results show the effectiveness of our method.
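
The homepage, then boards, then posts traversal described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `board_forum_crawl`, the URL patterns, the `TOY_SITE` link table, and the `is_board` / `is_post` classifiers are all hypothetical, and a real crawler would fetch pages over HTTP and would also need to handle details the abstract does not cover, such as paginated boards.

```python
from typing import Callable, Dict, List

# Hypothetical in-memory site: each URL maps to the links found on that page.
TOY_SITE: Dict[str, List[str]] = {
    "/": ["/board/1", "/board/2", "/login", "/help"],
    "/board/1": ["/post/10", "/post/11", "/", "/login"],
    "/board/2": ["/post/20", "/post/11", "/help"],
}


def get_links(url: str) -> List[str]:
    """Stand-in for fetching a page and extracting its links."""
    return TOY_SITE.get(url, [])


def is_board(url: str) -> bool:
    """Assumed URL pattern for board pages (illustrative only)."""
    return url.startswith("/board/")


def is_post(url: str) -> bool:
    """Assumed URL pattern for post pages (illustrative only)."""
    return url.startswith("/post/")


def board_forum_crawl(
    home: str,
    links_of: Callable[[str], List[str]],
    board_test: Callable[[str], bool],
    post_test: Callable[[str], bool],
) -> List[str]:
    """Visit the homepage, then each board, then collect each post once."""
    seen = set()
    posts: List[str] = []
    # Step 1: start from the homepage and keep only links that look like boards.
    boards = [u for u in links_of(home) if board_test(u)]
    # Step 2: enter each board exactly once.
    for board in boards:
        if board in seen:
            continue
        seen.add(board)
        # Step 3: collect the posts listed on the board, ignoring other links.
        for url in links_of(board):
            if post_test(url) and url not in seen:
                seen.add(url)
                posts.append(url)
    return posts


if __name__ == "__main__":
    print(board_forum_crawl("/", get_links, is_board, is_post))
```

Unlike a breadth-first crawl, this sketch never follows links such as `/login` or `/help`, which is one plausible reading of how the method concentrates on the meaningful pages of a forum.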