Journal: PLoS One

Enhancing web search result clustering model based on multiview multirepresentation consensus cluster ensemble (MMCC) approach


Abstract

Existing text clustering methods use only one representation at a time (single view), even though documents can be represented by multiple views. The multiview multirepresentation method enhances clustering quality. Moreover, existing clustering methods that do use more than one representation at a time (multiview) rely on representations of the same nature. Hence, using multiple views that represent the data in different ways is a reasonable way to create a diverse set of candidate clustering solutions. On this basis, an effective dynamic clustering method must consider combining multiple views of the data, including a semantic view, a lexical view (word weighting), and a topic view, as well as the number of clusters. The main goal of this study is to develop a new method that improves the performance of web search result clustering (WSRC). An enhanced multiview multirepresentation consensus clustering ensemble (MMCC) method is proposed to create a set of diverse candidate solutions and select high-quality overlapping clusters. The overlapping clusters are obtained from the candidate solutions created by different clustering methods. The framework for developing the proposed MMCC comprises five stages: (1) acquiring the standard datasets (MORESQUE and Open Directory Project-239), which are used to validate search result clustering algorithms; (2) preprocessing the datasets; (3) applying multiview multirepresentation clustering models; (4) using the radius-based cluster number estimation algorithm; and (5) employing the consensus clustering ensemble method. Results show that clustering methods improve when multiview multirepresentation is used. More importantly, the proposed MMCC model improves the overall performance of WSRC compared with all single-view clustering models.
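
The consensus step in stage (5) can be illustrated with a minimal sketch. The abstract does not specify how MMCC combines its candidate solutions, so the code below swaps in a generic co-association (evidence accumulation) scheme: each view contributes one candidate labeling, and two search results are merged when a majority of the candidates place them in the same cluster. The function names, the `threshold` parameter, and the toy labelings are all assumptions made for illustration. The sketch also yields hard clusters, not the overlapping clusters the paper selects.

```python
from itertools import combinations


def co_association(labelings):
    """Fraction of candidate clusterings that put each pair of items together."""
    n = len(labelings[0])
    m = len(labelings)
    counts = [[0.0] * n for _ in range(n)]
    for labels in labelings:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / m for c in row] for row in counts]


def consensus_clusters(labelings, threshold=0.5):
    """Merge items whose co-association exceeds `threshold` (hypothetical rule).

    Connected components of the thresholded co-association graph are
    found with a small union-find structure.
    """
    co = co_association(labelings)
    n = len(co)
    parent = list(range(n))

    def find(x):
        # Follow parent links to the root, compressing the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j in combinations(range(n), 2):
        if co[i][j] > threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Toy candidate solutions for six search results, one labeling per view.
# Label values are arbitrary per view; only co-membership matters.
views = [
    [0, 0, 0, 1, 1, 1],  # lexical view (word weighting)
    [0, 0, 1, 1, 2, 2],  # semantic view
    [1, 1, 1, 0, 0, 0],  # topic view
]
print(consensus_clusters(views))  # [[0, 1, 2], [3, 4, 5]]
```

In this toy run the semantic view disagrees with the other two about results 2 and 3, but the majority vote still recovers two groups. Raising `threshold` toward 1.0 keeps only pairs that every view agrees on.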
