
Quality Information Retrieval for the World Wide Web



Abstract

The World Wide Web is an unregulated communication medium with very limited means of quality control. Quality assurance has therefore become a key issue for many information retrieval services on the Internet, such as web search engines. This paper introduces methods for assessing the quality of web pages. The proposed evaluation mechanisms are based on a set of quality criteria extracted from a targeted user survey. A weighted algorithmic interpretation of the most significant user-cited quality criteria is proposed. In addition, the paper uses machine learning methods to predict the quality of web pages before they are downloaded. The set of quality criteria makes it possible to implement a web search engine with quality-based ranking schemes, and web crawlers that head directly for high-quality pages. The proposed approaches produce promising results on a sizeable web repository.
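The idea of a weighted interpretation of quality criteria can be illustrated with a minimal sketch. The abstract does not list the criteria, their weights, or the exact combination rule, so the criterion names (`currency`, `authority`, `readability`), the weight values, and the weighted-average formula below are all assumptions made purely for illustration, not the paper's method.

```python
from typing import Dict, List

# Hypothetical criteria and weights (assumed for illustration only;
# the abstract does not state which criteria the survey produced).
WEIGHTS: Dict[str, float] = {
    "currency": 0.40,
    "authority": 0.35,
    "readability": 0.25,
}


def quality_score(criteria: Dict[str, float],
                  weights: Dict[str, float] = WEIGHTS) -> float:
    """Return a weighted average of per-criterion scores.

    Each criterion score is clamped to [0, 1]; a criterion missing
    from `criteria` counts as 0.
    """
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted_sum = 0.0
    for name, weight in weights.items():
        value = min(max(criteria.get(name, 0.0), 0.0), 1.0)
        weighted_sum += weight * value
    return weighted_sum / total_weight


def rank_pages(pages: Dict[str, Dict[str, float]]) -> List[str]:
    """Return page identifiers ordered from highest to lowest score."""
    return sorted(pages, key=lambda url: quality_score(pages[url]),
                  reverse=True)
```

A search engine or crawler could then order candidate pages with `rank_pages`, for example to decide which pages to present or fetch first. How the per-criterion scores themselves are measured is left open here.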
