Conference: International Conference on Issues and Challenges in Intelligent Computing Techniques

Analysis for classification of similar documents among various websites using rapid miner


Abstract

The Web was originally intended to improve the management of general information about accelerators and experiments. It is also considered the most valuable place for information retrieval and knowledge discovery. When a search engine retrieves information in response to user queries, it returns a large, unmanageable collection of documents. Several web mining tools are used to classify, analyse, and order these documents so that users can navigate the search results easily and find the documents they want. A more efficient way to organize the documents is to combine similarity and ranking: similarity groups the documents by content or distance, and ranking orders the pages within each cluster or set. Based on this approach, this paper presents an analysis that produces ordered results, in the form of similar documents, across several sets of websites of interest to the user, using an open-source web mining tool called RapidMiner. The approach helps users restrict their search to a smaller number of pages of particular interest instead of navigating a huge collection of documents.
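The combination of similarity and ranking described in the abstract can be illustrated with a minimal sketch. It is not the authors' RapidMiner process and uses no RapidMiner operators; it is plain Python using only the standard library. The TF-IDF weighting, the cosine measure, the greedy threshold grouping, the ranking score (mean similarity to the other members of the group), and all function names and the `threshold` value are assumptions chosen for illustration.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Turn each document string into a sparse TF-IDF dict (assumed weighting)."""
    tokenized = [doc.lower().split() for doc in docs]
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    n_docs = len(docs)
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def group_and_rank(docs, threshold=0.2):
    """Group documents by similarity, then order each group.

    Grouping (assumption): a document joins the first group whose first
    member is at least `threshold` similar to it, else it starts a new group.
    Ranking (assumption): within a group, documents are ordered by their
    mean similarity to the other members, highest first.
    Returns a list of groups, each a list of document indices.
    """
    vectors = tf_idf_vectors(docs)
    groups = []
    for i, vec in enumerate(vectors):
        for group in groups:
            if cosine(vec, vectors[group[0]]) >= threshold:
                group.append(i)
                break
        else:
            groups.append([i])

    def centrality(i, group):
        others = [j for j in group if j != i]
        if not others:
            return 0.0
        return sum(cosine(vectors[i], vectors[j]) for j in others) / len(others)

    return [
        sorted(group, key=lambda i: centrality(i, group), reverse=True)
        for group in groups
    ]
```

Calling `group_and_rank` on a list of page texts returns index groups; a user would then browse only the group that matches their interest, in ranked order, instead of the full result list. A real ranking step could use link-based or query-based scores instead of the similarity-based score assumed here.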
