Journal: Information Processing & Management

Analysis of Web page image tag distribution characteristics

Abstract

The authors investigate the frequency distribution of image tag usage in Web pages. Using data sampled from top-level Web pages across five top-level domains and from sample pages within individual websites, they model the observed patterns by fitting the collected data distributions to theoretical models used in informetrics. The models tested include the modified power law (MPL), Mandelbrot (MDB), generalized Waring (GW), generalized inverse Gaussian-Poisson (GIGP), and generalized negative binomial (GNB) distributions. The GIGP provided the best fit for the data sets of top-level pages across the top-level domains tested. For data from specific websites, the models fit poorly because the observed data sets were multimodal; mixtures of the tested models provided better fits for these data sets. The ability to model Web page attributes effectively, such as the distribution of the number of image tags used per page, is needed for accurate simulation models of Web page content, and makes it possible to estimate the number of requests needed to display the complete content of Web pages. © 2004 Elsevier Ltd. All rights reserved.
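
To make the last point of the abstract concrete, the following is a minimal sketch of how a per-page image tag count can be turned into a frequency distribution and a rough request estimate. It is not the authors' method and does not implement any of the fitted models (MPL, MDB, GW, GIGP, GNB). The sample data, the function names, and the assumption that each page costs one request for the HTML document plus one per image tag (ignoring caching, repeated image URLs, and other resource types) are all hypothetical.

```python
from collections import Counter


def frequency_distribution(counts):
    """Map each image-tag count k to the number of pages with exactly k tags."""
    return dict(sorted(Counter(counts).items()))


def expected_requests(counts):
    """Mean requests per page.

    Assumption (not from the paper): one request for the HTML document
    plus one request per image tag on the page.
    """
    if not counts:
        raise ValueError("counts must be non-empty")
    return 1 + sum(counts) / len(counts)


# Hypothetical sample: image tags per page for eight pages (illustrative only).
sample = [0, 2, 2, 5, 9, 9, 9, 40]

freq = frequency_distribution(sample)
mean_requests = expected_requests(sample)

print(freq)
print(mean_requests)
```

An empirical table like `freq` is the kind of input that would then be compared against a fitted theoretical distribution; with a fitted model, the mean of that model could stand in for the sample mean in the request estimate.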
