Conference: 6th International Conference on Computer Sciences and Convergence Information Technology

Research on the parallel text clustering algorithm based on the semantic tree

Abstract

Text clustering algorithms that rely only on word frequency neglect the semantic relationships between words, so their results are imprecise. This paper proposes a semantic-tree-based text clustering algorithm built on WordNet. To reduce the time complexity, the algorithm is parallelized in a multi-process model in which several processes start at the same time. The master process partitions the data, sends and collects information, and clusters the results. The slave processes are mainly responsible for counting word frequencies, calculating weights, and retrieving the hypernyms of some words from the semantic tree. Experimental results show that the algorithm achieves both higher precision and lower time complexity.
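
The division of labor described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several pieces are assumptions made only for illustration: the `HYPERNYM` dictionary is a hypothetical toy stand-in for WordNet, the weights are plain relative frequencies, and the master's clustering step is a simple greedy cosine-similarity pass, since the abstract does not specify the weighting formula or the clustering method. All function names are hypothetical.

```python
from collections import Counter
from math import sqrt
from multiprocessing import Pool

# Hypothetical toy semantic tree (word -> hypernym); a stand-in for WordNet.
HYPERNYM = {
    "dog": "animal", "cat": "animal",
    "car": "vehicle", "bus": "vehicle",
}


def partition(docs, n_parts):
    """Master step: split (doc_id, text) pairs into n_parts chunks."""
    indexed = list(enumerate(docs))
    return [indexed[i::n_parts] for i in range(n_parts)]


def worker_task(chunk):
    """Slave step: count words, replace each word by its hypernym when one
    is known, and weight each concept by its relative frequency (assumed)."""
    out = []
    for doc_id, text in chunk:
        counts = Counter(HYPERNYM.get(w, w) for w in text.lower().split())
        total = sum(counts.values())
        out.append((doc_id, {c: n / total for c, n in counts.items()}))
    return out


def cosine(a, b):
    """Cosine similarity between two sparse weight vectors (dicts)."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = sqrt(sum(w * w for w in a.values()))
    nb = sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def merge_and_cluster(results, threshold=0.5):
    """Master step: collect the slaves' vectors, then cluster greedily.
    Each document joins the first cluster whose first member is similar
    enough; otherwise it starts a new cluster (assumed method)."""
    vectors = dict(pair for chunk in results for pair in chunk)
    clusters = []
    for doc_id in sorted(vectors):
        for members in clusters:
            if cosine(vectors[doc_id], vectors[members[0]]) >= threshold:
                members.append(doc_id)
                break
        else:
            clusters.append([doc_id])
    return clusters


def run_master(docs, n_workers=2):
    """Master process: partition, dispatch to slave processes, cluster."""
    chunks = partition(docs, n_workers)
    with Pool(n_workers) as pool:
        results = pool.map(worker_task, chunks)
    return merge_and_cluster(results)
```

Called under an `if __name__ == "__main__":` guard, `run_master(["dog cat dog", "car bus", "cat cat", "bus car car"])` groups documents 0 and 2 together and documents 1 and 3 together, even though "dog cat dog" and "cat cat" share only one surface word. Mapping words to hypernyms is what lets documents with different vocabulary land in the same cluster, which is the benefit the abstract attributes to the semantic tree.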