Published in: Advanced Workshop on Content Computing

Generating Different Semantic Spaces for Document Classification


Abstract

Document classification is an important technique in digital libraries, web page organization, and related fields. Because of synonymy and polysemy, it is better to classify documents based on latent semantics. A local semantic basis, which captures the features of documents within a particular category, has more discriminative power and is more effective for classification than a global semantic basis, which captures the features common to all available documents. Because the semantic basis obtained by nonnegative matrix factorization (NMF) corresponds directly to samples, whereas the basis obtained by singular value decomposition (SVD) does not, NMF is well suited to obtaining a local semantic basis. This paper compares global and local semantic bases obtained by SVD and NMF. The experimental results show that the local semantic basis obtained by NMF achieves the best classification accuracy.
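The distinction between a global and a local semantic basis can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy term-document matrices, the rank k = 2, the multiplicative-update NMF routine, and the rule of assigning a document to the category whose local basis reconstructs it with the smallest residual. The abstract does not say which classifier or NMF algorithm the authors used.

```python
import numpy as np

def nmf(V, k, iters=500, seed=0, eps=1e-9):
    """Multiplicative-update NMF: find nonnegative W, H with V ~ W @ H."""
    rng = np.random.default_rng(seed)
    n_terms, n_docs = V.shape
    W = rng.random((n_terms, k)) + eps
    H = rng.random((k, n_docs)) + eps
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def residual(basis, doc):
    """Norm of the part of doc that the basis cannot reconstruct."""
    coeffs, *_ = np.linalg.lstsq(basis, doc, rcond=None)
    return float(np.linalg.norm(doc - basis @ coeffs))

# Hypothetical toy data: rows are terms, columns are documents.
# Category A documents use terms 0-2, category B documents use terms 3-5.
V_a = np.array([[2, 1, 3], [1, 2, 1], [1, 1, 2],
                [0, 0, 0], [0, 0, 0], [0, 0, 0]], dtype=float)
V_b = np.array([[0, 0, 0], [0, 0, 0], [0, 0, 0],
                [2, 1, 1], [1, 3, 1], [1, 1, 2]], dtype=float)

k = 2  # assumed number of basis vectors per space

# Global basis: one SVD over all documents from every category.
U, s, Vt = np.linalg.svd(np.hstack([V_a, V_b]), full_matrices=False)
U_global = U[:, :k]

# Local bases: one NMF per category, fitted on that category's documents only.
W_a, _ = nmf(V_a, k)
W_b, _ = nmf(V_b, k)
local_bases = {"A": W_a, "B": W_b}

def classify(doc):
    """Assumed rule: pick the category whose local basis fits doc best."""
    return min(local_bases, key=lambda c: residual(local_bases[c], doc))
```

Calling `classify(np.array([1.0, 2.0, 1.0, 0, 0, 0]))` on a document built only from category A terms selects the local basis for A, because the basis for B has no weight on those terms. The NMF basis vectors are nonnegative and can be read as weighted groups of terms, whereas the columns of `U_global` may contain mixed signs, which is one way to read the abstract's point about NMF corresponding more directly to the samples.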
