Published in: 2011 6th International Conference for Internet Technology and Secured Transactions

Design and implementation of a web structure mining algorithm using breadth first search strategy for academic search application



Abstract

This paper deals with Web Structure Mining using the Breadth First Search strategy. While browsing the web, the user has to go through many pages, filter data and download the required information. This task of searching and downloading is time consuming. Sometimes a search query calls for a specific option, for example limiting the search to a few links. To reduce the time spent by users, a web link extraction tool has been designed and implemented in Java; it analyzes ways of extracting web link information using a standard interface. The test scenario is presented with keywords such as Higher Education, Conference Alerts and Special Interest Group. The present work can be a useful input to web users, faculty, students and web administrators in a university environment. The web extraction tool helps to save time in searching for and downloading files from the web. Another strong requirement is to verify whether the search keywords entered by the user give accurate and relevant results. This is made possible by performing a quick check on the search links. The user can also view the internal links present in the selected HTML files and the adjacency list of the crawled files.
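The abstract does not include the tool's source, so the following is only a minimal sketch of how a breadth-first link crawl that produces an adjacency list might look in Java. Everything named here is an assumption for illustration: the class `BfsLinkCrawler`, the methods `extractLinks` and `crawl`, the `maxPages` limit (standing in for "limiting the search to a few links"), and the in-memory `pages` map that replaces real HTTP fetching. The regular expression is a simplification; a real tool would more likely use an HTML parser.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration; not the paper's actual implementation.
final class BfsLinkCrawler {

    // Simplified matcher for double-quoted href attributes on <a> tags.
    private static final Pattern HREF = Pattern.compile(
            "<a\\s[^>]*href\\s*=\\s*\"([^\"]*)\"", Pattern.CASE_INSENSITIVE);

    /** Returns the href targets of one HTML document, in order, without duplicates. */
    static List<String> extractLinks(String html) {
        Set<String> links = new LinkedHashSet<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            links.add(m.group(1));
        }
        return new ArrayList<>(links);
    }

    /**
     * Breadth-first crawl starting at {@code start}, recording at most
     * {@code maxPages} pages. {@code pages} maps URL to HTML text and is an
     * assumed stand-in for downloading. The result maps each visited page to
     * its outgoing links, in visit order.
     */
    static Map<String, List<String>> crawl(String start,
                                           Map<String, String> pages,
                                           int maxPages) {
        Map<String, List<String>> adjacency = new LinkedHashMap<>();
        Set<String> seen = new LinkedHashSet<>();
        Queue<String> frontier = new ArrayDeque<>();
        frontier.add(start);
        seen.add(start);

        while (!frontier.isEmpty() && adjacency.size() < maxPages) {
            String url = frontier.remove();   // FIFO order gives breadth-first traversal
            String html = pages.get(url);
            if (html == null) {
                continue;                     // page unavailable; skip it
            }
            List<String> links = extractLinks(html);
            adjacency.put(url, links);
            for (String link : links) {
                if (seen.add(link)) {         // enqueue each link only once
                    frontier.add(link);
                }
            }
        }
        return adjacency;
    }
}
```

Calling `BfsLinkCrawler.crawl("index.html", pages, 3)` on a small map of pages returns the adjacency list for the first three pages reached level by level from `index.html`. The FIFO queue is what makes the traversal breadth-first, and the `seen` set prevents pages that link back to each other from being queued twice.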
