Journal: Knowledge-Based Systems

Semi-supervised evolutionary ensembles for Web video categorization


Abstract

Evolutionary Algorithms (EAs) have developed rapidly as a powerful, general learning approach and have been used successfully to find reasonable solutions in data mining and knowledge discovery. The Genetic Algorithm (GA) is a mainstream EA paradigm aimed at solving optimization problems. Clustering ensembles have emerged as a prominent machine learning technique that leverages the consensus across multiple clustering solutions and combines their predictions into a single solution with improved robustness, stability and accuracy. Advances in multimedia and the popularity of the social Web have together made it easy to generate videos in bulk, and categorizing such Web videos has become an active research challenge. In this paper, we propose a Semi-supervised Evolutionary Ensemble (SS-EE) framework for social media mining, e.g., Web Video Categorization (WVC), using the videos' low-cost textual features, intrinsic relations and extrinsic Web support. The contributions of this research work are as follows. First, we extend the traditional Vector Space Model (VSM) to a Semantic VSM (S-VSM) by considering the semantic similarity between feature terms, computed with the Normalized Google Distance (NGD). Second, we define a new distance measure, Triangular Similarity (TrS), between two Textual Feature Vectors (TFVs), based on the frequencies of the most relevant terms in each category. Third, we iterate the clustering ensemble process with the help of a GA guided by a new measure, Pre-Paired Percentage (PPP), which serves as the fitness function during the genetic cycle. Fourth, we define the key steps of the GA, the crossover and mutation genetic operators, through an intelligent clustering ensemble mechanism. Fifth, to terminate the genetic cycle, we define another new measure, Clustering Quality (Cq), based on the similarity matrix and clustering labels. Experiments on real-world social-Web data (YouTube) have been performed to validate the SS-EE framework.
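
The abstract names the Normalized Google Distance (NGD) as the source of semantic similarity between feature terms but does not restate its formula. As a minimal sketch, the standard NGD definition can be computed from search hit counts as below. The function name `ngd` and its parameters (`fx`, `fy`, `fxy`, `n`) are illustrative assumptions, and how the paper obtains the counts or converts the distance into an S-VSM similarity is not given here.

```python
import math


def ngd(fx: float, fy: float, fxy: float, n: float) -> float:
    """Standard Normalized Google Distance between two terms x and y.

    fx, fy : number of pages containing x, and containing y
    fxy    : number of pages containing both x and y
    n      : total number of indexed pages (assumed much larger than fx, fy)

    All names are illustrative; this is not taken from the paper's code.
    """
    # If either term, or the pair, never occurs, treat the terms as
    # infinitely far apart instead of taking log(0).
    if fx <= 0 or fy <= 0 or fxy <= 0:
        return math.inf
    log_fx, log_fy, log_fxy = math.log(fx), math.log(fy), math.log(fxy)
    return (max(log_fx, log_fy) - log_fxy) / (math.log(n) - min(log_fx, log_fy))
```

Terms that always co-occur get a distance of 0, and the distance grows as co-occurrence becomes rarer relative to the individual term frequencies.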
