Journal: Journal of information and computational science

Text Classification Model Based on Semantic Pattern Vector Space

Abstract

Most existing text classification systems are based on the bag-of-words vector space model (BOW-VSM) and classify documents by representing each one as a bag-of-words vector. Despite their speed and efficiency, BOW-VSM-based systems fail to represent the complete semantics of the document content, which leads to higher vector space dimensionality and lower classification performance. In general, a sentence is the smallest unit that expresses a complete meaning. We therefore take the sentences of a document as the smallest semantic processing unit and construct sentence-level semantic patterns, which form the semantic pattern vector space model (SP-VSM) of the document. On this basis, we design and implement an automatic text classification system. Finally, we carry out a comparative experiment on the classification performance of two kinds of classifiers with the two vector space models. The experimental results show that the SP-VSM-based text classification system achieves better classification performance than the BOW-VSM-based system.
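
The limitation of bag-of-words vectors described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the sentence-level semantic patterns are built, so `sentence_units` below is only a hypothetical stand-in (the ordered content words of each sentence), not the authors' SP-VSM construction. The example documents, the stopword list, and all function names are assumptions made for illustration.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and keep runs of ASCII letters as tokens."""
    return re.findall(r"[a-z]+", text.lower())

def bow_vector(doc, vocabulary):
    """Bag-of-words vector: one dimension per vocabulary word, value = term count."""
    counts = Counter(tokenize(doc))
    return [counts[word] for word in vocabulary]

def split_sentences(doc):
    """Naive sentence splitter on '.', '!' and '?' (an assumption, not a real parser)."""
    return [s.strip() for s in re.split(r"[.!?]+", doc) if s.strip()]

def sentence_units(doc, stopwords):
    """Hypothetical stand-in for a sentence-level pattern:
    the ordered tuple of non-stopword tokens in each sentence."""
    return [
        tuple(tok for tok in tokenize(sentence) if tok not in stopwords)
        for sentence in split_sentences(doc)
    ]

def unit_vector(doc, units, stopwords):
    """Vector with one dimension per sentence-level unit, value = unit count."""
    counts = Counter(sentence_units(doc, stopwords))
    return [counts[unit] for unit in units]

# Two made-up documents with the same words but opposite meanings.
docs = ["The cat chased the mouse.", "The mouse chased the cat."]
stopwords = {"the"}

vocabulary = sorted({tok for d in docs for tok in tokenize(d)})
units = sorted({u for d in docs for u in sentence_units(d, stopwords)})

bow = [bow_vector(d, vocabulary) for d in docs]
sent = [unit_vector(d, units, stopwords) for d in docs]

print(bow)   # [[1, 1, 1, 2], [1, 1, 1, 2]]  (identical vectors)
print(sent)  # [[1, 0], [0, 1]]              (distinct vectors)
```

In this toy case the two documents are indistinguishable as bags of words, while a representation that keeps the sentence as the unit separates them. The sketch says nothing about dimensionality or classifier accuracy, which the paper evaluates experimentally.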
