Published in: ACM Transactions on Internet Technology

Constructing Novel Block Layouts for Webpage Analysis


Abstract

Webpage segmentation is a basic building block for a wide range of webpage analysis methods. The rapid development of Web technologies has produced more dynamic and complex webpages, which bring new challenges to this area. To improve the performance of webpage segmentation, we propose a two-stage segmentation method that combines the visual, logical, and semantic features of the content on a webpage. In the first stage, we devise a new model that measures the similarity of elements on a webpage based on both visual layout and logical organization. In the second stage, we propose a novel block regrouping method that uses semantic statistics and visual positions. This two-stage method can effectively segment complicated and dynamic webpages. We verify the performance and accuracy of the method by comparing it with two existing webpage segmentation methods. The experimental results show that the proposed method significantly outperforms the existing state of the art in precision, recall, and accuracy.
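The abstract gives no formulas, so the following is only a minimal sketch of how a two-stage pipeline of this general shape might look, not the authors' model. Everything in it is an assumption made for illustration: the `Element` fields, the scoring functions `visual_similarity` and `logic_similarity`, the weight `alpha`, the thresholds, and the use of word-set Jaccard overlap as a stand-in for "semantic statistics".

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Element:
    # Hypothetical representation: bounding box, DOM path, visible text.
    x: float
    y: float
    w: float
    h: float
    dom_path: tuple  # e.g. ("html", "body", "nav", "a")
    text: str = ""


def visual_similarity(a, b):
    """Assumed visual score in [0, 1]: left-edge alignment plus width ratio."""
    align = 1.0 / (1.0 + abs(a.x - b.x))
    widest = max(a.w, b.w)
    width = min(a.w, b.w) / widest if widest > 0 else 1.0
    return 0.5 * align + 0.5 * width


def logic_similarity(a, b):
    """Assumed logical score in [0, 1]: shared DOM-path prefix fraction."""
    longest = max(len(a.dom_path), len(b.dom_path))
    if longest == 0:
        return 1.0
    shared = 0
    for tag_a, tag_b in zip(a.dom_path, b.dom_path):
        if tag_a != tag_b:
            break
        shared += 1
    return shared / longest


def stage_one(elements, alpha=0.5, threshold=0.8):
    """Stage 1: group elements whose combined similarity reaches the threshold."""
    parent = list(range(len(elements)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(elements)), 2):
        a, b = elements[i], elements[j]
        score = alpha * visual_similarity(a, b) + (1 - alpha) * logic_similarity(a, b)
        if score >= threshold:
            parent[find(i)] = find(j)

    blocks = {}
    for i, el in enumerate(elements):
        blocks.setdefault(find(i), []).append(el)
    return list(blocks.values())


def words(block):
    """Lower-cased word set of all text in a block."""
    return {w.lower() for el in block for w in el.text.split()}


def vertical_gap(a, b):
    """Vertical distance between two blocks' bounding boxes (0 if they overlap)."""
    top_a, bottom_a = min(e.y for e in a), max(e.y + e.h for e in a)
    top_b, bottom_b = min(e.y for e in b), max(e.y + e.h for e in b)
    return max(top_a - bottom_b, top_b - bottom_a, 0.0)


def stage_two(blocks, max_gap=20.0, min_overlap=0.3):
    """Stage 2: merge blocks that are vertically close and share vocabulary."""
    blocks = [list(b) for b in blocks]
    merged = True
    while merged:
        merged = False
        for i, j in combinations(range(len(blocks)), 2):
            wa, wb = words(blocks[i]), words(blocks[j])
            union = wa | wb
            jaccard = len(wa & wb) / len(union) if union else 0.0
            if vertical_gap(blocks[i], blocks[j]) <= max_gap and jaccard >= min_overlap:
                # Merge j into i, then rescan because indices have shifted.
                blocks[i].extend(blocks.pop(j))
                merged = True
                break
    return blocks
```

A caller would run `stage_two(stage_one(elements))` on elements extracted from a rendered page. The split mirrors the abstract's description: the first pass looks only at layout and DOM structure, while the second pass revisits the resulting blocks using text statistics and position.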
