Conference: 2011 Eighth Web Information Systems and Applications Conference

Dynamic Splog Filtering Algorithm Based on Combinational Features


Abstract

This paper focuses on spam blog (splog) detection. Blogs are a highly popular new-media mechanism for social communication. Existing algorithms identify splogs using lexical frequency features, which are highly redundant and lack correlation; this degrades blog search results and wastes network resources. Our approach uses a dynamic filtering algorithm based on the combinational features of splogs (CFDS) to detect splogs. The CFDS algorithm selects several efficient, novel features, such as self-similarity features and author attributes, to replace the larger set of redundant lexical frequency features. Moreover, we extract a content-based feature vector from different parts of the blog. The dimensionality of the feature vector is reduced using the ECE (Expected Cross Entropy) evaluation criterion. We tested an SVM-based splog detector using the combinational features on standard datasets and obtained excellent filtering efficiency.
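The abstract names ECE as the criterion for reducing the feature vector but does not give the formula. As a rough illustration only, the sketch below uses the definition commonly seen in text-categorization work, ECE(t) = P(t) · Σ_c P(c|t) · log(P(c|t) / P(c)), with probabilities estimated from document counts. The function names, the toy documents, and the use of term presence rather than term frequency are all assumptions made for this example and may differ from what the paper actually does.

```python
import math
from collections import Counter, defaultdict


def expected_cross_entropy(docs, labels):
    """Score terms by ECE(t) = P(t) * sum_c P(c|t) * log(P(c|t) / P(c)).

    docs: list of token lists; labels: class label per document.
    All probabilities are estimated from document counts (assumption).
    """
    n_docs = len(docs)
    class_counts = Counter(labels)            # documents per class
    term_counts = Counter()                   # documents containing each term
    term_class_counts = defaultdict(Counter)  # term -> class -> document count

    for tokens, label in zip(docs, labels):
        for term in set(tokens):              # count presence, not frequency
            term_counts[term] += 1
            term_class_counts[term][label] += 1

    scores = {}
    for term, n_t in term_counts.items():
        p_t = n_t / n_docs
        total = 0.0
        # Classes in which the term never occurs contribute 0 and are skipped.
        for label, n_tc in term_class_counts[term].items():
            p_c_given_t = n_tc / n_t
            p_c = class_counts[label] / n_docs
            total += p_c_given_t * math.log(p_c_given_t / p_c)
        scores[term] = p_t * total
    return scores


def select_top_terms(scores, k):
    """Keep the k highest-scoring terms as the reduced feature set."""
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical toy corpus: two splog posts and two ordinary blog posts.
docs = [
    ["the", "cheap", "pills", "buy"],
    ["the", "buy", "cheap", "now"],
    ["the", "my", "trip", "today"],
    ["the", "today", "my", "garden"],
]
labels = ["splog", "splog", "blog", "blog"]

scores = expected_cross_entropy(docs, labels)
top_terms = select_top_terms(scores, 4)
```

In this toy corpus a term that appears equally in both classes (here "the") scores zero, while terms concentrated in one class score higher, which is the behavior a dimensionality-reduction step would rely on before training a classifier such as an SVM.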
