Conference: International Conference on Computational Science and Its Applications; ICCSA 2008

Table Based Single Pass Algorithm for Clustering News Articles in NewsPage.com


Abstract

This research proposes a modified version of the single pass algorithm specialized for text clustering. Encoding documents as numerical vectors, as the traditional single pass algorithm requires, causes two main problems: huge dimensionality and sparse distribution. To address both problems, this research modifies the single pass algorithm so that documents are encoded in a form other than numerical vectors. In the proposed version, documents are mapped into tables, and an operation on two tables is defined for use by the single pass algorithm. The goal of this research is to improve the performance of the single pass algorithm for text clustering.
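The abstract does not specify the table format or the operation on two tables, so the following is only a minimal sketch of the general idea under stated assumptions. Here a table is assumed to be a word-to-frequency mapping, and the hypothetical `table_similarity` function (shared frequency over total frequency) stands in for the paper's operation. The names `to_table`, `table_similarity`, `single_pass`, and the `threshold` parameter are illustrative, not taken from the paper. The clustering loop itself follows the standard single pass scheme: each document is compared once against the existing clusters and either joins the most similar one or starts a new cluster.

```python
from collections import Counter


def to_table(text):
    """Encode a document as a word -> frequency table (assumed table format)."""
    return Counter(text.lower().split())


def table_similarity(a, b):
    """Assumed operation on two tables: shared weight over total weight, in [0, 1]."""
    shared = sum(min(a[w], b[w]) for w in a.keys() & b.keys())
    total = sum(a.values()) + sum(b.values())
    return 2.0 * shared / total if total else 0.0


def single_pass(docs, threshold=0.3):
    """Cluster documents in one pass; returns clusters as lists of document indices."""
    clusters = []  # each cluster: {"table": merged Counter, "members": [indices]}
    for i, text in enumerate(docs):
        table = to_table(text)
        best, best_sim = None, 0.0
        # Find the existing cluster whose table is most similar to this document.
        for cluster in clusters:
            sim = table_similarity(table, cluster["table"])
            if sim > best_sim:
                best, best_sim = cluster, sim
        if best is not None and best_sim >= threshold:
            best["members"].append(i)
            best["table"].update(table)  # merge tables by adding frequencies
        else:
            clusters.append({"table": table, "members": [i]})
    return [cluster["members"] for cluster in clusters]
```

For example, `single_pass(["stock market falls", "stock market rises", "football team wins match"])` groups the first two documents together and places the third in its own cluster. Because each table stores only the words that actually occur in a document, this representation avoids the fixed-length, mostly-zero vectors that the abstract identifies as the source of huge dimensionality and sparse distribution.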
