Journal: Application Research of Computers (《计算机应用研究》)

A Multi-Variable pLSI Algorithm for Extracting Sensitive Features from Text

Abstract

Extracting sensitive words and similar features is a key step in analyzing sensitive topics in social networks. The probabilistic topic models that are currently popular perform poorly on this task, because the features are semantically complex and the data are very noisy. This paper proposes a multi-variable probabilistic latent semantic indexing (pLSI) algorithm that uses several kinds of auxiliary information, such as social-network tags and the emoticon images embedded in text, to improve feature extraction. Experimental results show that the algorithm achieves high classification accuracy with low time overhead. It is well suited as a dimensionality-reduction method for extracting sensitive features in social networks.
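For readers unfamiliar with the baseline model, the following is a minimal sketch of standard pLSI fitted by EM, written in plain Python. It is not the paper's multi-variable algorithm: the abstract does not say how tags and emoticons enter the model, so that part is not reproduced here. The function name `plsi`, its parameters, and the toy count matrix are assumptions made for illustration only.

```python
import random


def plsi(counts, n_topics, n_iter=50, seed=0):
    """Fit standard pLSI by EM (illustrative sketch, not the paper's variant).

    counts: document-term count matrix as a list of equal-length lists.
    Returns (p_z_d, p_w_z), where p_z_d[d][z] = P(z|d) and p_w_z[z][w] = P(w|z).
    """
    rng = random.Random(seed)
    n_docs, n_words = len(counts), len(counts[0])

    def normalize(row):
        # Scale a non-negative row to sum to 1; fall back to uniform if empty.
        total = sum(row)
        if total == 0:
            return [1.0 / len(row)] * len(row)
        return [x / total for x in row]

    # Random positive initialization of both parameter tables.
    p_z_d = [normalize([rng.random() + 1e-3 for _ in range(n_topics)])
             for _ in range(n_docs)]
    p_w_z = [normalize([rng.random() + 1e-3 for _ in range(n_words)])
             for _ in range(n_topics)]

    for _ in range(n_iter):
        new_z_d = [[0.0] * n_topics for _ in range(n_docs)]
        new_w_z = [[0.0] * n_words for _ in range(n_topics)]
        for d in range(n_docs):
            for w in range(n_words):
                n = counts[d][w]
                if n == 0:
                    continue
                # E-step: posterior P(z|d,w) is proportional to P(z|d) * P(w|z).
                joint = [p_z_d[d][z] * p_w_z[z][w] for z in range(n_topics)]
                total = sum(joint)
                if total == 0:
                    continue
                for z in range(n_topics):
                    resp = n * joint[z] / total
                    # Accumulate expected counts for the M-step.
                    new_z_d[d][z] += resp
                    new_w_z[z][w] += resp
        # M-step: renormalize the expected counts into probabilities.
        p_z_d = [normalize(row) for row in new_z_d]
        p_w_z = [normalize(row) for row in new_w_z]

    return p_z_d, p_w_z


# Toy example: 4 documents over a 4-word vocabulary, reduced to 2 topics.
toy_counts = [[4, 3, 0, 0], [5, 2, 0, 1], [0, 0, 6, 3], [0, 1, 4, 5]]
doc_topics, topic_words = plsi(toy_counts, n_topics=2, n_iter=30)
```

In this sketch each row `doc_topics[d]` is a low-dimensional representation of document `d`, which is the sense in which pLSI acts as a dimensionality-reduction step before classification.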
