European Conference on Principles and Practice of Knowledge Discovery in Databases; September 20-24, 2004; Pisa (IT)

Experimenting SnakeT: A Hierarchical Clustering Engine for Web-Page Snippets

Abstract

Current search engines return a ranked list of web pages, each represented by a page excerpt called a web snippet. The ranking is computed according to some relevance criterion that takes into account textual and hyperlink information about the web pages (see e.g. [1]). This approach is well known, and much research is devoted to designing better and faster ranking criteria. However, it is now equally well known that a flat list of results limits the retrieval of precise answers, for several reasons. First, the relevance of the query results is a subjective and time-varying concept that strictly depends on the context in which the user formulates the query. Second, the ever-growing web is increasing the number and heterogeneity of candidate query answers. Third, web users have limited patience and usually look only at the top ten results. The net outcome is that retrieving the correct answer is becoming more and more difficult, if not impossible, for a standard user. It is therefore not surprising that new IR tools are being designed to boost, or complement, the efficacy of search-engine ranking algorithms. These tools offer new ways of organizing and presenting the query results that are more intuitive and simpler to browse, so that users can satisfy their needs faster. Among the various proposals, one has recently become popular thanks to the engine Vivisimo (see Figure 1), which has received the "Best Metasearch Engine Award" from SearchEngineWatch.com in the last three years.
