Conference: IEEE Congress on Evolutionary Computation

Using semantic similarity matrix for defining operations involved in NTSO for clustering 20NewsGroups

Abstract

In this research, we propose a similarity-matrix-based version of NTSO (Neural Text Self Organization) as an approach to text clustering. Traditional approaches to text clustering require documents to be encoded as numerical vectors, and this encoding causes two main problems: huge dimensionality and sparse distribution. To address these problems, we propose encoding documents as string vectors and using NTSO, a neural network that operates on string vectors, for text clustering. By encoding documents in this alternative form, we attempt to avoid both problems entirely. As empirical validation, the proposed approach will be compared with other approaches with respect to clustering performance and speed.
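
The contrast between the two encodings can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify how string vectors or the semantic similarity matrix are built. The toy corpus, the vector length `d`, the "most frequent words" ranking, the co-occurrence-based word similarity, and every function name here are assumptions made for illustration only.

```python
from collections import Counter

# Toy corpus (illustrative assumption, not data from the paper).
docs = [
    "the network learns clusters of text documents",
    "text clustering groups similar documents together",
    "the engine of the car needs a new engine part",
]

def tokenize(text):
    return text.lower().split()

# Numerical encoding: one dimension per vocabulary word (bag of words).
vocab = sorted({w for doc in docs for w in tokenize(doc)})

def to_numerical_vector(text):
    counts = Counter(tokenize(text))
    return [counts[w] for w in vocab]

def sparsity(vector):
    """Fraction of zero entries in a numerical vector."""
    return sum(1 for x in vector if x == 0) / len(vector)

# String vector encoding (assumed scheme): the d most frequent words,
# ties broken alphabetically, padded with "" if the text is too short.
def to_string_vector(text, d=3):
    counts = Counter(tokenize(text))
    ranked = sorted(counts, key=lambda w: (-counts[w], w))
    return (ranked + [""] * d)[:d]

# Word-by-word similarity matrix (assumed definition): a Dice-style score
# based on how many documents contain both words.
def build_similarity_matrix(corpus):
    doc_sets = [set(tokenize(doc)) for doc in corpus]
    words = sorted(set().union(*doc_sets))
    df = {w: sum(w in s for s in doc_sets) for w in words}
    sim = {}
    for a in words:
        for b in words:
            both = sum(1 for s in doc_sets if a in s and b in s)
            sim[(a, b)] = 2 * both / (df[a] + df[b])
    return sim

# Similarity between two string vectors: average of element-wise lookups
# in the similarity matrix; unknown or padded entries count as 0.
def string_vector_similarity(u, v, sim):
    pairs = list(zip(u, v))
    return sum(sim.get((a, b), 0.0) for a, b in pairs) / len(pairs)

if __name__ == "__main__":
    sim = build_similarity_matrix(docs)
    for doc in docs:
        num = to_numerical_vector(doc)
        print(len(num), round(sparsity(num), 2), to_string_vector(doc))
    u, v = to_string_vector(docs[0]), to_string_vector(docs[1])
    print(string_vector_similarity(u, v, sim))
```

In this sketch the numerical vector grows with the vocabulary and is mostly zeros, while the string vector keeps a fixed, small length; the similarity matrix stands in for the numerical operations that a string vector cannot support directly.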
