An efficient clustering framework for relevant web information

Abstract

As the amount of information available on the Internet grows, it is becoming increasingly difficult for users to find information relevant to their needs. This has created a need for an automated tool that can find information quickly and easily. In this paper, we propose a clustering framework for crawling Web pages and clustering the necessary information from them. The proposed framework consists of three modules: a preprocessing module, a clustering module, and a community module. Using this framework, we are able to automatically cluster Web pages by topic and rank them by relevance. We describe the framework and present the results of our preliminary validation work.
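
To make the pipeline in the abstract concrete, the following is a minimal sketch of how preprocessing, topic clustering, and relevance ranking could fit together. It is not the authors' implementation: the abstract does not specify the algorithms, so the term-frequency vectors, cosine similarity, greedy single-pass clustering, the `threshold` value, and every function name here are assumptions chosen for illustration. The community module is left out because the abstract does not describe what it does.

```python
import math
import re
from collections import Counter
from typing import List


def preprocess(html: str) -> List[str]:
    """Hypothetical preprocessing step: strip tags, lowercase, tokenize."""
    text = re.sub(r"<[^>]+>", " ", html)
    return re.findall(r"[a-z]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster(vectors: List[Counter], threshold: float = 0.3) -> List[List[int]]:
    """Assumed clustering step: greedy single-pass grouping.

    A page joins the first cluster whose seed (first member) is at least
    `threshold` similar to it; otherwise it starts a new cluster.
    """
    clusters: List[List[int]] = []
    for i, vec in enumerate(vectors):
        for members in clusters:
            if cosine(vec, vectors[members[0]]) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters


def rank(members: List[int], vectors: List[Counter], query: str) -> List[int]:
    """Assumed ranking step: order pages by similarity to a query string."""
    q = Counter(preprocess(query))
    return sorted(members, key=lambda i: cosine(vectors[i], q), reverse=True)


# Toy input standing in for crawled pages (illustrative data only).
pages = [
    "<p>Python clustering of web pages</p>",
    "<p>Clustering web pages with Python code</p>",
    "<p>Recipes for chocolate cake</p>",
]
vectors = [Counter(preprocess(p)) for p in pages]
clusters = cluster(vectors)
ranked = rank(clusters[0], vectors, "python code")
```

With this toy input, the two clustering-related pages fall into one cluster and the recipe page into another, and the query "python code" ranks the second page ahead of the first. A real system would likely replace raw term counts with a weighting such as TF-IDF and use a more robust clustering method.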
