Experimenting SnakeT: A Hierarchical Clustering Engine for Web-Page Snippets

Abstract

Current search engines return a ranked list of web pages, each represented by a page excerpt called a web snippet. The ranking is computed according to some relevance criterion that takes into account textual and hyperlink information about the web pages (see e.g. [1]). This approach is well known, and a lot of research is pushing towards the design of better and faster ranking criteria. However, it is nowadays equally well known that a flat list of results limits the retrieval of precise answers, for several reasons. First, the relevance of the query results is a subjective and time-varying concept that strictly depends on the context in which the user formulates the query. Second, the ever-growing web is enlarging the number and heterogeneity of candidate query answers. Third, web users have limited patience, so they usually look only at the top ten results. The net outcome is that retrieving the correct answer is getting more and more difficult, if not impossible, for a standard user. It is therefore not surprising that new IR tools are being designed to boost, or complement, the efficacy of search-engine ranking algorithms. These tools offer new ways of organizing and presenting the query results that are more intuitive and simpler to browse, so that users may satisfy their needs faster. Among the various proposals, one has recently become popular thanks to the engine Vivisimo (see Figure 1), which over the last three years has received the "Best Metasearch Engine Award" from SearchEngineWatch.com.

Bibliographic details

Source
European Conference on Principles and Practice of Knowledge Discovery in Databases; 20-24 September 2004; Pisa (IT) | 2004 | pp. 543-545 | 3 pages
Conference location: Pisa (IT)
Authors
Paolo Ferragina; Antonio Gulli
Author affiliation

Dipartimento di Informatica, Università di Pisa
Original format: PDF
Language: English
Chinese Library Classification: Artificial intelligence theory

Similar literature

1. A personalized search engine based on Web-snippet hierarchical clustering [J] . P. Ferragina, A. Gulli Software . 2008, No. 2

2. A Non-Redundant Hierarchical Web Snippet Clustering System to Enhance WWW Search [J] . SHIHCHIEH CHOU, CHJENCHENG SUN, SZUJUI HUANG WSEAS Transactions on Information Science and Applications . 2007, No. 2

3. Chaotic cluster itinerancy and hierarchical cluster trees in electrochemical experiments [J] . Istvan Z. Kiss Chaos . 2003, No. 3

4. Experimenting SnakeT: A Hierarchical Clustering Engine for Web-Page Snippets [C] . Paolo Ferragina, Antonio Gulli European Conference on Principles and Practice of Knowledge Discovery in Databases . 2004

5. Pseudo-hierarchical ant-based clustering using a heterogeneous agent hierarchy and automatic boundary formation. [D] . Brown, Jeremy Bernard. 2009

6. MCMSeq: Bayesian hierarchical modeling of clustered and repeated measures RNA sequencing experiments [O] . Brian E. Vestal, Camille M. Moore, Elizabeth Wynn, 2020

7. Experimenting Snaket: A hierarchical clustering engine for web-page snippets [O] . FERRAGINA P, GULLÌ A 2004
