Journal: 《安徽职业技术学院学报》 (Journal of Anhui Vocational and Technical College)

Research on a Probabilistic Latent Semantic Analysis Algorithm Based on Parallel Computing

Abstract

Probabilistic Latent Semantic Analysis (PLSA) recasts the document-word relationship as a document-topic-word relationship, which supports operations such as ranking, filtering, and classifying documents, but training the model is computationally very expensive. This paper designs an efficient parallel scheme for PLSA based on MPI (Message Passing Interface), optimizing the model system, the handling of training data, and the parallel algorithm itself, and proposes a parallel PLSA algorithm for big-data settings. The algorithm addresses the problem that data sets were previously too large to compute on. Compared with the version before optimization, training is considerably faster, and the approach is scalable and feasible.
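The abstract does not say how the work is divided among processes, so the following is only a minimal sketch of one common way to parallelize PLSA training, not the paper's scheme. It assumes the documents are sharded across workers, that each worker runs the EM updates on its own shard, and that the unnormalized topic-word counts are then summed across workers. In an MPI program that sum would typically be an `MPI_Allreduce` with `MPI_SUM`; here a plain Python function, `allreduce_sum`, stands in for it so the example runs in a single process. All function names and the toy data layout are hypothetical.

```python
import math
import random

def normalize(row):
    """Scale a non-negative vector to sum to 1 (uniform if it is all zeros)."""
    total = sum(row)
    if total == 0:
        return [1.0 / len(row)] * len(row)
    return [x / total for x in row]

def local_em_step(shard, p_z_d, p_w_z):
    """One EM pass over the documents held by a single worker.

    shard  : list of documents, each a dict {word_id: count}
    p_z_d  : list of P(z|d) vectors, one per document in the shard
    p_w_z  : K x V matrix of P(w|z), identical on every worker
    Returns the shard's updated P(z|d), its unnormalized K x V
    topic-word counts, and its log-likelihood under the input parameters.
    """
    K, V = len(p_w_z), len(p_w_z[0])
    counts = [[0.0] * V for _ in range(K)]
    new_p_z_d, loglik = [], 0.0
    for doc, theta in zip(shard, p_z_d):
        doc_topic = [0.0] * K
        for w, n in doc.items():
            # E-step: P(z|d,w) is proportional to P(z|d) * P(w|z)
            joint = [theta[z] * p_w_z[z][w] for z in range(K)]
            p_w = sum(joint)
            loglik += n * math.log(p_w)
            for z in range(K):
                r = n * joint[z] / p_w  # expected count of (d, w) assigned to z
                counts[z][w] += r
                doc_topic[z] += r
        new_p_z_d.append(normalize(doc_topic))  # M-step for P(z|d), local only
    return new_p_z_d, counts, loglik

def allreduce_sum(matrices):
    """Single-process stand-in for an MPI all-reduce with a sum operation."""
    K, V = len(matrices[0]), len(matrices[0][0])
    return [[sum(m[z][w] for m in matrices) for w in range(V)] for z in range(K)]

def train(docs, V, K, n_workers, iters, seed=0):
    """Run document-sharded PLSA EM with n_workers simulated workers."""
    rng = random.Random(seed)
    shards = [docs[i::n_workers] for i in range(n_workers)]
    # Random P(w|z) breaks the symmetry between topics; P(z|d) starts uniform.
    p_w_z = [normalize([rng.random() + 0.1 for _ in range(V)]) for _ in range(K)]
    p_z_d = [[[1.0 / K] * K for _ in shard] for shard in shards]
    history = []
    for _ in range(iters):
        results = [local_em_step(s, t, p_w_z) for s, t in zip(shards, p_z_d)]
        p_z_d = [r[0] for r in results]
        total = allreduce_sum([r[1] for r in results])
        p_w_z = [normalize(row) for row in total]  # M-step for P(w|z), global
        history.append(sum(r[2] for r in results))
    return p_w_z, p_z_d, history
```

Calling `train(docs, V, K, n_workers, iters)` on a list of `{word_id: count}` dictionaries returns the topic-word matrix, the per-shard document-topic vectors, and the log-likelihood after each pass. Because only the K x V count matrix crosses worker boundaries, the communication volume in this sketch does not grow with the number of documents, which is the usual motivation for sharding by document.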
