Conference: International Conference on Web Information Systems and Technologies

Automatic Web Page Classification Using Visual Content for Subjective and Functional Variables

Abstract

Automatic classification of webpages has several applications in industry, including digital marketing, search engines, and content filtering. Traditionally, this classification has relied only on the textual information of webpages, which includes the HTML code, tags, title, and, more recently, the URL. The aim of this paper is to show that, for some subjective variables that are nonetheless very important to these applications, the visual information of webpages as rendered by the browser carries extremely rich content for the classification task. The variables studied are aesthetic value (whether pages are beautiful or ugly) and design recency (whether pages look old-fashioned or modern). We show that automatic classifiers relying only on the visual look and feel can achieve very high accuracy. By using several low-level and mid-level features and studying several criteria for feature selection and classification, our classifiers improve on the state of the art. Finally, we apply this framework to classify webpages by topic (content-aware) and to classify whether or not a page is a blog (function-aware).
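To make the idea of classifying a page from its rendered appearance concrete, here is a minimal sketch in Python. It is not the authors' pipeline: the abstract does not name the specific features or classifiers used. The sketch assumes a coarse RGB color histogram as one possible low-level feature and uses a nearest-centroid rule as a stand-in classifier. The function names (`color_histogram`, `train_centroids`, `classify`) and the pixel data are hypothetical and purely illustrative.

```python
from collections import Counter
from math import dist


def color_histogram(pixels, bins=4):
    """Assumed low-level feature: normalized joint RGB histogram.

    `pixels` is a list of (r, g, b) tuples with values in 0..255,
    standing in for a rendered screenshot of a webpage.
    """
    step = 256 // bins
    counts = [0] * (bins ** 3)
    for r, g, b in pixels:
        # Clamp so the top of the range never falls outside the last bin.
        ri = min(r // step, bins - 1)
        gi = min(g // step, bins - 1)
        bi = min(b // step, bins - 1)
        counts[ri * bins * bins + gi * bins + bi] += 1
    total = len(pixels) or 1
    return [c / total for c in counts]


def train_centroids(samples):
    """Average the feature vectors of each label.

    `samples` is a list of (feature_vector, label) pairs.
    Returns a dict mapping each label to its mean feature vector.
    """
    sums, counts = {}, Counter()
    for vec, label in samples:
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, value in enumerate(vec):
            acc[i] += value
        counts[label] += 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def classify(centroids, vec):
    """Return the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: dist(centroids[label], vec))


# Synthetic toy "screenshots" (invented for illustration, not data from the paper).
light_page = [(250, 250, 250)] * 90 + [(30, 30, 30)] * 10
loud_page = [(255, 255, 0)] * 50 + [(0, 0, 255)] * 50

centroids = train_centroids([
    (color_histogram(light_page), "modern"),
    (color_histogram(loud_page), "old_fashioned"),
])

query = [(245, 245, 245)] * 80 + [(20, 20, 20)] * 20
predicted = classify(centroids, color_histogram(query))
```

In practice the screenshot would come from a browser rendering, and the single histogram would be replaced by a richer set of low-level and mid-level features followed by a feature selection step, as the abstract describes.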
