Journal: International Journal of Engineering and Technology

Relation Based Mining Model for Enhancing Web Document Clustering

Abstract

Designing a web information management system is becoming increasingly complex, and its time complexity grows accordingly. Information retrieval is difficult because of the huge volume of web documents. Clustering makes retrieval easier and less time-consuming. The proposed algorithm is a web document clustering approach that uses the semantic relations between documents to reduce time complexity. It identifies the relations and concepts in a document and computes a relation score between documents. The algorithm extracts key concepts from web documents through preprocessing, stemming, and stop-word removal. The identified concepts, together with a domain ontology, are used to compute a document relation score and a cluster relation score. The web document cluster is then identified from these two scores. The algorithm is evaluated on 200,000 web documents, with 60 percent used as the training set and 40 percent as the testing set.
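The abstract does not define how the two scores are computed, so the following is only a minimal sketch of the described pipeline under stated assumptions. The stop-word list, the toy ontology, the crude suffix-stripping stemmer, the use of Jaccard overlap as the document relation score, and the mean-over-members cluster relation score are all hypothetical stand-ins, not the authors' method.

```python
import re

# Hypothetical stop-word list and toy domain ontology (stem -> concept).
# Neither comes from the paper; they exist only to make the sketch runnable.
STOP_WORDS = {"the", "a", "an", "of", "and", "is", "are", "in", "to", "for", "on", "with"}
ONTOLOGY = {
    "cluster": "clustering",
    "group": "clustering",
    "retriev": "retrieval",
    "search": "retrieval",
    "ontolog": "semantics",
    "semantic": "semantics",
}


def stem(token):
    """Crude suffix stripping; a stand-in for a real stemmer such as Porter's."""
    for suffix in ("ing", "al", "ies", "es", "s", "y"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 4:
            return token[: -len(suffix)]
    return token


def extract_concepts(text):
    """Preprocess, drop stop words, stem, and map stems to ontology concepts."""
    tokens = re.findall(r"[a-z]+", text.lower())
    stems = (stem(t) for t in tokens if t not in STOP_WORDS)
    return {ONTOLOGY[s] for s in stems if s in ONTOLOGY}


def document_relation_score(concepts_a, concepts_b):
    """Assumed score: Jaccard overlap of the two documents' concept sets."""
    union = concepts_a | concepts_b
    if not union:
        return 0.0
    return len(concepts_a & concepts_b) / len(union)


def cluster_relation_score(concepts, cluster):
    """Assumed score: mean document relation score against cluster members."""
    if not cluster:
        return 0.0
    total = sum(document_relation_score(concepts, member) for member in cluster)
    return total / len(cluster)


def best_cluster(concepts, clusters):
    """Return the index of the cluster with the highest cluster relation score."""
    scores = [cluster_relation_score(concepts, c) for c in clusters]
    return max(range(len(clusters)), key=scores.__getitem__)


# Two single-document clusters and one new document to place.
clusters = [
    [extract_concepts("Clustering web documents into groups")],
    [extract_concepts("Semantic search and retrieval")],
]
new_doc = extract_concepts("Retrieving search results")
print(best_cluster(new_doc, clusters))  # prints 1
```

Comparing small concept sets rather than full term vectors is one plausible reading of how a semantic approach could reduce time complexity, but the abstract does not confirm this.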
