
Method for estimating coverage of Web search engines

Abstract

A computerized method estimates the relative coverage of Web search engines. Each search engine maintains an index of the words in pages located at specific URLs in a network. The method generates a random query, which is a logical combination of words found in a subset of the pages. The random query is submitted to a first search engine. In response, a set of URLs is received; each URL identifies a page that is indexed by the first search engine and satisfies the random query. One URL is selected at random from this set to identify a sample page. A strong query corresponding to the sample page is generated and submitted to a second search engine. The result information received in response to the strong query is examined to determine whether the second search engine has indexed the sample page, or a page substantially similar to it. This procedure is repeated to gather statistical data, which is used to estimate the relative sizes of the search engines and the amount of overlap between them.
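The sampling loop described in the abstract can be illustrated with a minimal sketch. Everything named here is hypothetical and not taken from the patent: `ToyEngine` is an in-memory stand-in for a search engine, random queries are assumed to be conjunctions of two lexicon words, the strong query is assumed to be the rarest words on the sample page, and a match means an identical URL (the abstract also allows substantially similar pages, which this toy does not model).

```python
import random


class ToyEngine:
    """Hypothetical stand-in for a search engine: maps URL -> set of words."""

    def __init__(self, pages):
        self.index = dict(pages)

    def search(self, words):
        # Conjunctive (AND) query: URLs whose page contains every word.
        wanted = set(words)
        return [url for url, page in self.index.items() if wanted <= page]


def random_query(lexicon, rng, n_words=2):
    # Assumption: a random query is n_words distinct lexicon words, ANDed.
    return rng.sample(lexicon, n_words)


def strong_query(page_words, doc_freq, k=3):
    # Assumption: the k rarest words on the page identify it well.
    return sorted(page_words, key=lambda w: (doc_freq.get(w, 0), w))[:k]


def estimate_containment(engine_a, engine_b, lexicon, doc_freq, trials, rng):
    """Estimate the fraction of engine_a's sampled pages that engine_b indexes."""
    hits = sampled = 0
    for _ in range(trials):
        urls = engine_a.search(random_query(lexicon, rng))
        if not urls:
            continue  # the random query matched nothing; draw another
        url = rng.choice(urls)  # randomly selected sample page
        # The toy reads the page's words from A's index; a real system
        # would fetch the page itself.
        query = strong_query(engine_a.index[url], doc_freq)
        sampled += 1
        if url in engine_b.search(query):  # exact-URL match only
            hits += 1
    return hits / sampled if sampled else float("nan")


def relative_size(frac_a_in_b, frac_b_in_a):
    # If sampling were uniform, |A and B| = frac_a_in_b * |A| = frac_b_in_a * |B|,
    # hence |A| / |B| = frac_b_in_a / frac_a_in_b.
    return frac_b_in_a / frac_a_in_b


if __name__ == "__main__":
    pages_a = {"u1": {"a", "b", "c"}, "u2": {"a", "b", "d"}, "u3": {"b", "c", "d"}}
    pages_b = {"u1": {"a", "b", "c"}}
    lexicon = ["a", "b", "c", "d"]
    doc_freq = {"a": 2, "b": 3, "c": 2, "d": 2}
    rng = random.Random(0)
    a, b = ToyEngine(pages_a), ToyEngine(pages_b)
    print(estimate_containment(a, b, lexicon, doc_freq, 200, rng))
```

Running the estimate in both directions and passing the two fractions to `relative_size` gives a size ratio, assuming the samples are close to uniform over each index. Sampling through queries generally is not uniform, so such a ratio should be read as a rough estimate.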
