Conference: IEEE International Conference on Network Infrastructure and Digital Content

UGC quality evaluation based on meta-learning and content feature analysis

Abstract

With the rapid development of social networking services, network users publish an ever-growing volume of information. Given this information explosion, automatically assessing the quality of user-generated content (UGC) has become a challenging research task, and addressing it requires an effective UGC quality evaluation system. Based on our experience, we believe that the textual content of UGC is the key factor in its quality. In this paper we therefore focus on quality evaluation and classification based on textual content, rather than on publishing-related data such as the number of times a post has been commented on or forwarded. First, we extract various features from the text using natural language processing techniques, such as word segmentation, keyword extraction, topic modeling, sentence parsing, and distributed word representations. Second, we build several base classifiers using different features and different machine learning algorithms, each of which assigns UGC one of four quality labels. Third, we build a global meta-learning model on top of these base classifiers to generate the final quality label for each item. We ran a series of experiments on real-world data collected from Tianya Forum, using 10-fold cross-validation to test the model. The results show that the proposed meta-learning model performs much better.
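The pipeline described in the abstract (base classifiers trained on different feature groups, a meta-level model trained on their outputs, and 10-fold cross-validation) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not name the base algorithms or the meta-learner, so a nearest-centroid classifier and a lookup-table meta-learner stand in for them, the four numeric columns stand in for text-derived features, the data is synthetic rather than from Tianya Forum, and all function and class names are hypothetical.

```python
import random
from collections import Counter, defaultdict

def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

class CentroidClassifier:
    """Stand-in base learner: nearest class centroid over selected columns."""
    def __init__(self, columns):
        self.columns = columns

    def fit(self, X, y):
        counts = Counter(y)
        sums = defaultdict(lambda: [0.0] * len(self.columns))
        for row, label in zip(X, y):
            for j, c in enumerate(self.columns):
                sums[label][j] += row[c]
        self.centroids = {lab: [s / counts[lab] for s in sums[lab]] for lab in counts}
        return self

    def predict_one(self, row):
        def dist(lab):
            return sum((row[c] - m) ** 2
                       for c, m in zip(self.columns, self.centroids[lab]))
        return min(sorted(self.centroids), key=dist)

class TableMetaLearner:
    """Stand-in meta-learner: maps each tuple of base predictions to the
    label most often seen with it; falls back to a vote for unseen tuples."""
    def fit(self, base_preds, y):
        votes = defaultdict(Counter)
        for preds, label in zip(base_preds, y):
            votes[tuple(preds)][label] += 1
        self.table = {p: c.most_common(1)[0][0] for p, c in votes.items()}
        return self

    def predict_one(self, preds):
        preds = tuple(preds)
        if preds in self.table:
            return self.table[preds]
        return Counter(preds).most_common(1)[0][0]

def fit_stack(X, y, column_sets, k=5, seed=0):
    """Train the meta-learner on out-of-fold base predictions, then refit
    every base learner on all of the training data."""
    n = len(X)
    oof = [[None] * len(column_sets) for _ in range(n)]
    for fold in k_fold_indices(n, k, seed):
        held = set(fold)
        train = [i for i in range(n) if i not in held]
        for b, cols in enumerate(column_sets):
            clf = CentroidClassifier(cols).fit([X[i] for i in train],
                                               [y[i] for i in train])
            for i in fold:
                oof[i][b] = clf.predict_one(X[i])
    meta = TableMetaLearner().fit(oof, y)
    bases = [CentroidClassifier(cols).fit(X, y) for cols in column_sets]
    return bases, meta

def predict_stack(bases, meta, row):
    return meta.predict_one([b.predict_one(row) for b in bases])

def cross_val_accuracy(X, y, column_sets, k=10, seed=1):
    """Outer k-fold cross-validation of the whole stacked model."""
    correct = 0
    for fold in k_fold_indices(len(X), k, seed):
        held = set(fold)
        train = [i for i in range(len(X)) if i not in held]
        bases, meta = fit_stack([X[i] for i in train],
                                [y[i] for i in train], column_sets)
        correct += sum(predict_stack(bases, meta, X[i]) == y[i] for i in fold)
    return correct / len(X)

def make_toy_data(n=200, seed=0):
    """Synthetic four-class data; columns 0-1 and 2-3 are two feature groups."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        c = i % 4  # hypothetical quality label 0..3
        X.append([c * 10 + rng.gauss(0, 1), rng.gauss(0, 1),
                  (c % 2) * 10 + rng.gauss(0, 1), (c // 2) * 10 + rng.gauss(0, 1)])
        y.append(c)
    return X, y

COLUMN_SETS = [[0, 1], [2, 3]]  # one base classifier per feature group

if __name__ == "__main__":
    X, y = make_toy_data()
    print("10-fold accuracy:", cross_val_accuracy(X, y, COLUMN_SETS, k=10))
```

The meta-learner is fitted on out-of-fold predictions so that it never sees a base prediction made on an example the base classifier was trained on. This is a common precaution in stacking; whether the paper does the same is not stated in the abstract.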