European Conference on Principles and Practice of Knowledge Discovery in Databases

Experimenting SnakeT: A Hierarchical Clustering Engine for Web-Page Snippets

Abstract

Current search engines return a ranked list of web pages, each represented by a page excerpt called a web snippet. The ranking is computed according to a relevance criterion that takes into account textual and hyperlink information about the web pages (see e.g. [1]). This approach is well known, and much research is directed at designing better and faster ranking criteria. However, it is now equally well known that a flat list of results limits the retrieval of precise answers, for several reasons. First, the relevance of the query results is a subjective and time-varying concept that strictly depends on the context in which the user formulates the query. Second, the ever-growing web is increasing the number and heterogeneity of candidate query answers. Third, web users have limited patience, so they usually look only at the top ten results. The net outcome is that retrieving the correct answer is becoming more and more difficult, if not impossible, for a standard user.