Journal: Library & Information Science Research

A generic lexical URL segmentation framework for counting links, colinks or URLs


Abstract

Large sets of Web page links, colinks, or URLs sometimes need to be counted or otherwise summarized by researchers to analyze Web growth or publishing. Computing professionals also use them to evaluate Web sites or optimize search engines. Despite the apparently simple nature of these types of data, many different summarization methods have been used in the past. Some of these methods may not have been optimal. This article proposes a generic lexical framework to unify and extend existing methods through abstract notions of link lists and URL lists. The approach is built upon decomposing URLs by lexical segments, such as domain names, and systematically characterizing the counting options available. In addition, counting method choice recommendations are inferred from a very general set of theoretical research assumptions. The article also offers practical advice for analyzing raw data from search engines. (C) 2008 Elsevier Inc. All rights reserved.
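The idea of decomposing URLs by lexical segments and then counting at a chosen level can be illustrated with a minimal Python sketch. This is not the article's own implementation: the level names (`page`, `directory`, `domain`, `tld`), the helper functions `lexical_segment` and `count_segments`, and the example URLs are all assumptions made for illustration.

```python
from collections import Counter
from urllib.parse import urlsplit


def lexical_segment(url: str, level: str) -> str:
    """Truncate a URL to one lexical segment.

    The four levels are illustrative assumptions, not taken from the article.
    """
    # Assume http:// when no scheme is given so urlsplit can find the host.
    parts = urlsplit(url if "://" in url else "http://" + url)
    host = (parts.hostname or "").lower()
    if level == "page":
        return host + parts.path
    if level == "directory":
        # Drop everything after the last slash of the path.
        return host + parts.path.rsplit("/", 1)[0] + "/"
    if level == "domain":
        return host
    if level == "tld":
        # Final dot-separated label of the host name.
        return host.rsplit(".", 1)[-1]
    raise ValueError(f"unknown level: {level}")


def count_segments(urls, level):
    """Count how many URLs in the list fall under each segment at a level."""
    return Counter(lexical_segment(u, level) for u in urls)


# Hypothetical URL list, e.g. as might be returned by a search engine query.
urls = [
    "http://www.example.ac.uk/a/b/page.html",
    "http://www.example.ac.uk/a/c.html",
    "https://other.example.com/",
]
by_domain = count_segments(urls, "domain")
by_tld = count_segments(urls, "tld")
```

In this sketch, `sum(by_domain.values())` counts every URL occurrence, while `len(by_domain)` counts each distinct domain once. Which of the two is appropriate, and at which segment level, is the kind of counting choice the abstract refers to.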
