Journal: Procedia Computer Science

Multi label text classification method based on co-occurrence latent semantic vector space

Abstract

To address conceptual ambiguity and the hidden semantic structure of text in multi-label text categorization, an ensemble classification method is proposed that combines the random forest (RF) algorithm with a semantic-core co-occurrence latent semantic vector space model (CLSVSM). Randomly partitioning the words increases the diversity of the ensemble and yields different orthogonal projections onto low-dimensional latent semantic spaces. Random forests handle binary classification problems effectively, and the latent semantic representation reveals the underlying semantic structure of the text. Combining the two provides both diversity and accuracy among the individual classifiers. Experimental results on the Yahoo dataset demonstrate the effectiveness of the proposed method, which outperforms the compared methods in Hamming loss, coverage, one-error, and average precision.
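The four evaluation measures named at the end of the abstract have standard definitions in the multi-label literature. The sketch below assumes those standard definitions, since the paper's exact formulas are not given here, and takes the last two measures to be one-error and ranking-based average precision. All function names and the toy data are illustrative, and every document is assumed to have at least one true label.

```python
from typing import Dict, List, Sequence, Set


def hamming_loss(true_sets: Sequence[Set[int]],
                 pred_sets: Sequence[Set[int]],
                 n_labels: int) -> float:
    """Fraction of (document, label) slots where prediction and truth differ."""
    wrong = sum(len(t ^ p) for t, p in zip(true_sets, pred_sets))
    return wrong / (len(true_sets) * n_labels)


def _ranks(scores: Sequence[float]) -> Dict[int, int]:
    """Map each label index to its 1-based rank (rank 1 = highest score)."""
    order: List[int] = sorted(range(len(scores)), key=lambda j: -scores[j])
    return {label: pos for pos, label in enumerate(order, start=1)}


def one_error(true_sets: Sequence[Set[int]],
              scores: Sequence[Sequence[float]]) -> float:
    """Fraction of documents whose top-ranked label is not a true label."""
    misses = 0
    for t, s in zip(true_sets, scores):
        top_label = min(_ranks(s).items(), key=lambda kv: kv[1])[0]
        misses += top_label not in t
    return misses / len(true_sets)


def coverage(true_sets: Sequence[Set[int]],
             scores: Sequence[Sequence[float]]) -> float:
    """Average number of extra ranking steps needed to reach every true label."""
    total = 0
    for t, s in zip(true_sets, scores):
        r = _ranks(s)
        total += max(r[label] for label in t) - 1
    return total / len(true_sets)


def average_precision(true_sets: Sequence[Set[int]],
                      scores: Sequence[Sequence[float]]) -> float:
    """Mean, over true labels, of the share of true labels ranked at or above each."""
    total = 0.0
    for t, s in zip(true_sets, scores):
        r = _ranks(s)
        per_doc = sum(
            sum(1 for k in t if r[k] <= r[label]) / r[label] for label in t
        )
        total += per_doc / len(t)
    return total / len(true_sets)


# Toy data (hypothetical): 2 documents, 3 labels.
true_labels = [{0, 2}, {1}]
label_scores = [[0.9, 0.2, 0.6], [0.5, 0.4, 0.1]]
# Predicted label sets obtained by thresholding the scores at 0.5.
predicted = [{j for j, v in enumerate(s) if v >= 0.5} for s in label_scores]

if __name__ == "__main__":
    print("Hamming loss:     ", hamming_loss(true_labels, predicted, n_labels=3))
    print("One-error:        ", one_error(true_labels, label_scores))
    print("Coverage:         ", coverage(true_labels, label_scores))
    print("Average precision:", average_precision(true_labels, label_scores))
```

Lower is better for Hamming loss, one-error, and coverage, while higher is better for average precision. Note that some libraries define coverage without subtracting one, so values are only comparable when the same convention is used.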
