Research on the parallel text clustering algorithm based on the semantic tree

Abstract

Text clustering algorithms that rely only on word frequency neglect the semantic relationships between words, so their results are imprecise. This paper proposes a text clustering algorithm based on a semantic tree built from WordNet. To reduce the time complexity, we adopt a parallel algorithm in a multi-process model, in which several processes start at the same time. The master process partitions the data, sends and collects information, and clusters the results. The slave processes are mainly in charge of counting word frequencies, calculating weights, and looking up the hypernyms of some words in the semantic tree. Experimental results show that the algorithm achieves both higher precision and lower time complexity.
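The master/slave division of labor described in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the `HYPERNYMS` dictionary is a toy stand-in for WordNet, the weights are assumed to be normalized term frequencies, and the greedy threshold clustering in `cluster` is a placeholder, since the abstract does not say which weighting scheme or clustering method the paper uses. All function names are hypothetical.

```python
import math
from collections import Counter
from multiprocessing import Pool

# Toy stand-in for WordNet's hypernym relation (hypothetical data, not the real database).
HYPERNYMS = {"dog": "animal", "cat": "animal", "car": "vehicle", "bus": "vehicle"}


def partition(docs, n_parts):
    """Master step: split the documents into contiguous chunks, at most n_parts of them."""
    size = max(1, -(-len(docs) // n_parts))  # ceiling division, never zero
    return [docs[i:i + size] for i in range(0, len(docs), size)]


def slave_task(chunk):
    """Slave step: count words, credit each word's hypernym, and normalize to weights."""
    vectors = []
    for doc in chunk:
        counts = Counter(doc.lower().split())
        for word, freq in list(counts.items()):
            parent = HYPERNYMS.get(word)
            if parent is not None:
                counts[parent] += freq  # the hypernym inherits the word's frequency
        total = sum(counts.values())
        # Assumed weighting: plain normalized frequency.
        vectors.append({w: c / total for w, c in counts.items()})
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse weight vectors stored as dicts."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm = math.sqrt(sum(w * w for w in u.values())) * math.sqrt(sum(w * w for w in v.values()))
    return dot / norm if norm else 0.0


def cluster(vectors, threshold=0.2):
    """Placeholder clustering: a document joins the first cluster whose first member is similar enough."""
    clusters = []  # each cluster is a list of document indices
    for i, vec in enumerate(vectors):
        for members in clusters:
            if cosine(vec, vectors[members[0]]) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters


def master(docs, n_procs=2):
    """Master process: partition the data, dispatch it to slaves, collect results, cluster."""
    chunks = partition(docs, n_procs)
    with Pool(processes=n_procs) as pool:
        per_chunk = pool.map(slave_task, chunks)  # results come back in chunk order
    vectors = [vec for chunk in per_chunk for vec in chunk]
    return cluster(vectors)


if __name__ == "__main__":
    docs = ["dog barks loudly", "cat sleeps quietly", "car stops here", "bus waits there"]
    print(master(docs))  # prints [[0, 1], [2, 3]]
```

In this toy run no two documents share a surface word, so word frequency alone would leave every pair with zero similarity. The shared hypernyms ("animal", "vehicle") are what bring the first two and the last two documents together, which is the effect the abstract attributes to the semantic tree.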

Bibliographic record

Source
6th International Conference on Computer Sciences and Convergence Information Technology | 2011 | pp. 400-403 | 4 pages
Conference location: Seogwipo (KR)
Authors
Liu Gangfeng; Wang Yunlan; Zhao Tianhai; Li Dongyang
Affiliation
Center for High Performance Computing, Northwestern Polytechnical University, Xi'an
Original format: PDF
Language: English
CLC classification: Computing technology, computer technology
