Conference: Tools with Artificial Intelligence, ICTAI, 2008 20th IEEE International Conference on

Profile-Based Focused Crawler for Social Media-Sharing Websites


Abstract

In this paper, we present a novel profile-based focused crawling system for increasingly popular social media-sharing websites. The system treats users' profiles as ranking criteria that guide the crawling process. Each user's profile is divided into two parts: an internal part, which comes from the user's own contributions, and an external part, which comes from the user's social contacts. To extract data from a social media-sharing website efficiently and effectively for focused crawling, a path-string-based page-classification method was first developed to identify list pages, detail pages, and profile pages.
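
The abstract does not give the ranking formula, so the following is only a minimal sketch of how a two-part profile could drive a crawl frontier. Everything named here is an assumption made for illustration: the `profile_score` function, the term-weight dictionaries, the mixing weight `alpha`, and the `Frontier` class are hypothetical and are not taken from the paper.

```python
import heapq
from collections import Counter


def profile_score(page_terms, internal, external, alpha=0.7):
    """Score a page against a two-part user profile.

    internal / external map terms to weights (hypothetical representation).
    alpha is an assumed mixing weight, not a value from the paper.
    """
    counts = Counter(page_terms)
    total = sum(counts.values()) or 1  # avoid division by zero on empty pages
    internal_hit = sum(internal.get(t, 0.0) * c for t, c in counts.items()) / total
    external_hit = sum(external.get(t, 0.0) * c for t, c in counts.items()) / total
    return alpha * internal_hit + (1 - alpha) * external_hit


class Frontier:
    """Crawl frontier that always pops the highest-scoring unseen URL."""

    def __init__(self):
        self._heap = []
        self._seen = set()

    def push(self, url, score):
        # Each URL is queued at most once.
        if url not in self._seen:
            self._seen.add(url)
            # heapq is a min-heap, so store the negated score.
            heapq.heappush(self._heap, (-score, url))

    def pop(self):
        neg_score, url = heapq.heappop(self._heap)
        return url, -neg_score

    def __len__(self):
        return len(self._heap)
```

In this sketch, a crawler would score each discovered page with `profile_score`, push its URL onto the `Frontier`, and fetch pages in the order that `pop` returns them, so pages that match the user's own contributions and those of their contacts are visited first.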
