Clustering of Short Text in Micro-blog Based on K-means Algorithm

Abstract

This paper proposes a short-text clustering method based on the K-means algorithm. First, short texts are collected from the Internet with a web crawler. The texts are then preprocessed to remove irrelevant content such as noisy data, punctuation, and stop words. Next, word segmentation is performed on the preprocessed texts, and the segmented words are converted into distributed representations. Finally, the texts are clustered and sorted with the K-means algorithm. The experimental results indicate that the proposed method is suitable for short-text clustering.
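
The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the crawler step is omitted, whitespace splitting on English text stands in for Chinese word segmentation, the stop-word list and the two-dimensional word vectors (STOP_WORDS, TOY_VECTORS) are hypothetical stand-ins for a real stop-word list and learned embeddings, and a text is represented by the average of its word vectors, which the abstract does not specify.

```python
import random
import re

# Hypothetical stop words and toy 2-D word vectors (assumptions, not from the paper).
STOP_WORDS = {"the", "a", "is", "in", "on", "and", "to", "of", "with"}
TOY_VECTORS = {
    "football": (1.0, 0.1), "goal": (0.9, 0.2), "match": (0.8, 0.0),
    "stock": (0.1, 1.0), "market": (0.0, 0.9), "price": (0.2, 0.8),
}

def preprocess(text):
    """Lowercase, replace punctuation with spaces, split, drop stop words."""
    cleaned = re.sub(r"[^\w\s]", " ", text.lower())
    return [w for w in cleaned.split() if w not in STOP_WORDS]

def embed(tokens, dim=2):
    """Average the vectors of known tokens; zero vector if none are known."""
    known = [TOY_VECTORS[t] for t in tokens if t in TOY_VECTORS]
    if not known:
        return tuple(0.0 for _ in range(dim))
    return tuple(sum(v[i] for v in known) / len(known) for i in range(dim))

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd-style K-means; returns one cluster label per point."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initial centroids drawn from the data
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid for every point.
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(col) / len(members)
                                     for col in zip(*members))
    return labels

texts = [
    "The football match ended with a late goal!",
    "Stock market price is falling.",
    "A goal in the match.",
    "Price of the stock.",
]
labels = kmeans([embed(preprocess(t)) for t in texts], k=2)
```

On this toy input the two sports texts end up in one cluster and the two finance texts in the other. For real micro-blog data, a dedicated segmenter and trained word embeddings would replace the stand-ins above.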

Bibliographic details

Source: IEEE International Conference of Safety Produce Informatization, 2018, pp. 812-815 (4 pages)
Conference location: Chongqing (CN)
Authors: Mao Xingliang; Li Fangfang
Author affiliations:

College of Information System and Management, National University of Defense Technology, Changsha, Hunan Province 410073, China;

School of Information Science and Engineering, Central South University, Changsha, Hunan Province 410083, China;

Original format: PDF
Keywords:
learning (artificial intelligence); pattern clustering; text analysis; Web sites;

Similar literature

1. Micro-Blog Topic Detection Method Based on BTM Topic Model and K-Means Clustering Algorithm [J]. Weijiang Li, Yanming Feng, Dongjun Li. Automatic Control and Computer Sciences, 2016, No. 4.
2. Arabic Text Clustering Based on K-Means Algorithm with Semantic Word Embedding [J]. Hasnaa R. H. Soliman, Mohamed Grida, Mohamed Hassan. Journal of Theoretical and Applied Information Technology, 2019, No. 21.
3. An Improved Clustering Algorithm for Text Mining: Multi-Cluster Spherical K-Means [J]. Tunali Volkan, Bilgin Turgay, Camurcu Ah. The International Arab Journal of Information Technology, 2016, No. 1.
4. Clustering of Short Text in Micro-blog Based on K-means Algorithm [C]. Mao Xingliang, Li Fangfang. IEEE International Conference on Safety Produce Informatization, 2018.
5. A K-means based watershed imaging segmentation algorithm for banana cluster quality inspection [D]. Castillo Cepin, Gregorio Alfonso. 2016.
6. Does Determination of Initial Cluster Centroids Improve the Performance of K-Means Clustering Algorithm? Comparison of Three Hybrid Methods by Genetic Algorithm Minimum Spanning Tree and Hierarchical Clustering in an Applied Study [O]. Saeedeh Pourahmad, Atefeh Basirat, Amir Rahimi. 2020.
7. Design and Application of a Text Clustering Algorithm Based on Parallelized K-Means Clustering [O]. Hui Wang, Chengdong Zhou, Leixiao Li. 2019.
