Conference: International Conference on Artificial Intelligence

SM based Operation for Specializing a Fast Clustering Algorithm for Text Clustering

Abstract

This research proposes a new strategy for text clustering in which documents are encoded into string vectors and the single pass algorithm is modified so that it can operate on string vectors. Traditionally, when the single pass algorithm is used for pattern clustering, raw data must first be encoded into numerical vectors. This encoding can be difficult, depending on the application area. In text clustering, for example, encoding full texts into numerical vectors leads to two main problems: huge dimensionality and sparse distribution. To address these two problems, we encode full texts into string vectors and apply the single pass algorithm to the string vectors for text clustering.
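
The idea of running a single pass algorithm over string vectors can be illustrated with a minimal Python sketch. The abstract does not specify how documents are encoded, how similarity between string vectors is computed, or how a cluster is represented, so everything below is an assumption made for illustration: `encode` keeps the most frequent words of a document, `similarity` is a plain Jaccard overlap standing in for whatever operation the paper defines, and each cluster is represented by its first member.

```python
from collections import Counter


def encode(text, size=3):
    """Hypothetical encoding: the `size` most frequent words of a document,
    ordered by descending frequency (ties broken alphabetically)."""
    counts = Counter(text.lower().split())
    ranked = sorted(counts, key=lambda w: (-counts[w], w))
    return tuple(ranked[:size])


def similarity(a, b):
    """Stand-in similarity between two string vectors: Jaccard overlap
    of their words. This is an assumption, not the paper's operation."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def single_pass(vectors, threshold=0.5):
    """Single pass clustering: each item joins the most similar existing
    cluster if the similarity reaches `threshold`, otherwise it starts a
    new cluster. The first member of a cluster acts as its representative
    (an assumption, since string vectors cannot be averaged)."""
    clusters = []
    for v in vectors:
        best, best_sim = None, -1.0
        for cluster in clusters:
            s = similarity(v, cluster[0])
            if s > best_sim:
                best, best_sim = cluster, s
        if best is not None and best_sim >= threshold:
            best.append(v)
        else:
            clusters.append([v])
    return clusters


docs = [
    "cat cat dog dog pet",
    "dog dog cat pet pet",
    "stock stock market market trade",
]
clusters = single_pass([encode(d) for d in docs])
for i, cluster in enumerate(clusters):
    print(i, cluster)
```

Because each string vector holds only a fixed, small number of words, the representation avoids the huge dimensionality and sparse distribution that the abstract attributes to numerical vectors. The result of a single pass depends on the order of the input and on the chosen threshold.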
