IEEE/WIC/ACM International Joint Conferences on Web Intelligence and Intelligent Agent Technologies

A Dynamic Approach to the Website Boundary Detection Problem Using Random Walks

Abstract

This paper investigates the Website Boundary Detection (WBD) problem in the dynamic context. In the dynamic context (as opposed to the static context), the web data to be considered is not fully available before the website boundary detection process starts. The dynamic approaches presented in this paper are all probabilistic and based on the concept of random walks; three variations are considered: (i) the standard Random Walk (RW), (ii) a Self-Avoiding RW, and (iii) the Metropolis-Hastings RW. The reported evaluation demonstrates that the proposed technique produces good WBD solutions while also reducing the number of "noise" pages visited. The best-performing variation was found to be the Metropolis-Hastings RW.
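
The three walk variations named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation; it rests on several assumptions: the web graph is given as an undirected adjacency dictionary, the function name `random_walk` and the variant labels `"rw"`, `"sarw"` and `"mhrw"` are hypothetical, the self-avoiding walk falls back to any neighbour when every neighbour has already been visited, and the Metropolis-Hastings walk uses the common degree-ratio acceptance rule that targets a uniform distribution over nodes.

```python
import random


def random_walk(graph, start, steps, rng, variant="rw"):
    """Walk `steps` steps from `start`; return the nodes visited, in order.

    graph   : dict mapping node -> list of neighbours (assumed undirected)
    variant : "rw"   standard random walk
              "sarw" self-avoiding walk (prefers unvisited neighbours)
              "mhrw" Metropolis-Hastings walk (degree-ratio acceptance)
    """
    current = start
    path = [current]
    visited = {current}
    for _ in range(steps):
        neighbours = graph[current]
        if not neighbours:
            break  # isolated node: nowhere to go
        if variant == "sarw":
            # Assumption: if every neighbour was already visited,
            # fall back to a plain uniform step instead of stopping.
            fresh = [n for n in neighbours if n not in visited]
            nxt = rng.choice(fresh or neighbours)
        else:
            nxt = rng.choice(neighbours)  # uniform proposal
            if variant == "mhrw":
                # Accept with probability min(1, deg(current) / deg(nxt));
                # on rejection the walk stays where it is for this step.
                accept = min(1.0, len(graph[current]) / len(graph[nxt]))
                if rng.random() >= accept:
                    nxt = current
        current = nxt
        path.append(current)
        visited.add(current)
    return path


# Toy example: four pages linked in a line, 0 - 1 - 2 - 3.
line = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
# The self-avoiding walk has exactly one unvisited neighbour at each step.
print(random_walk(line, 0, 3, random.Random(0), "sarw"))  # [0, 1, 2, 3]
```

In a dynamic setting the adjacency dictionary would not be known up front; a lookup such as `graph[current]` would stand in for fetching a page and extracting its links when the walk first reaches it.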
