Journal: Information Processing & Management

Finding similar academic Web sites with links, bibliometric couplings and colinks

Abstract

A common task in both Webmetrics and Web information retrieval is to identify a set of Web pages or sites that are similar in content. In this paper we assess the extent to which links, colinks and couplings can be used to identify similar Web sites. As an experiment, a random sample of 500 pairs of domains from the UK academic Web was taken, and human assessments of site similarity, based upon content type, were compared against ratings for the three concepts. The results show that using a combination of all three gives the highest probability of identifying similar sites, but surprisingly this was only a marginal improvement over using links alone. Another unexpected result was that high values for either colink counts or couplings were associated with only a small increase in the likelihood of similarity. The principal advantage of using couplings and colinks was found to be greater coverage, in that a much larger number of pairs of sites are connected by these measures, rather than an increased probability of similarity. In information retrieval terminology, this is improved recall rather than improved precision.
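The three measures compared in the abstract can be illustrated with a small sketch. It is not the authors' procedure: the abstract does not define the measures, so the code assumes the usual readings (a direct link joins the two sites; a colink is a third site that links to both; a coupling is a third site that both link to). The function name `link_measures` and the `(source, target)` input format are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def link_measures(links, a, b):
    """Return (direct, colinks, couplings) for the site pair (a, b).

    links: iterable of (source_site, target_site) pairs (assumed format).
    direct:    number of directions in which a and b link to each other (0-2)
    colinks:   third sites that link to both a and b
    couplings: third sites that both a and b link to
    """
    out_links = defaultdict(set)  # site -> set of sites it links to
    in_links = defaultdict(set)   # site -> set of sites linking to it
    for src, dst in links:
        if src == dst:
            continue  # ignore links within a single site
        out_links[src].add(dst)
        in_links[dst].add(src)

    pair = {a, b}
    direct = int(b in out_links[a]) + int(a in out_links[b])
    # Exclude a and b themselves so only third-party sites are counted.
    colinks = len((in_links[a] - pair) & (in_links[b] - pair))
    couplings = len((out_links[a] - pair) & (out_links[b] - pair))
    return direct, colinks, couplings
```

Under these assumptions, a pair of sites can have no direct link and still be connected through colinks or couplings, which matches the abstract's point that these two measures mainly extend coverage (recall) rather than raise the probability of similarity (precision).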