Conference: IEEE International Conference on Tools with Artificial Intelligence

Combining Statistics-Based and CNN-Based Information for Sentence Classification

Abstract

Sentence classification, which serves as the foundation for subsequent text-based processing, continues to attract researchers' attention. Recently, with the great success of deep learning, the convolutional neural network (CNN), a common deep learning architecture, has been widely applied to this field and has achieved excellent performance. However, most CNN-based studies focus on using complex architectures to extract more effective category information, which requires more time to train the models. Aiming for better classification performance at lower time cost, this paper proposes two simple and effective methods that combine information extracted from statistics with information extracted from a CNN. The first method is S-SFCNN, which combines statistical features and CNN-based probabilistic features of classification to build feature vectors; the vectors are then used to train logistic regression classifiers. The second method is C-SFCNN, which combines CNN-based features and statistics-based probabilistic features of classification to build feature vectors. In both methods, the Naive Bayes log-count ratios are selected as the text statistical features, and a single-layer, single-channel CNN is used as the CNN architecture. Test results on 7 tasks show that our methods can achieve better performance than many more complex CNN models at lower time cost. In addition, we summarize, through experiments, the main factors that influence the performance of our methods.
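
The abstract does not spell out how the feature vectors are built, so the following is only a minimal sketch of the S-SFCNN idea under stated assumptions. It computes Naive Bayes log-count ratios using the commonly used definition (smoothed per-class token counts, each normalised by its L1 norm, then the log of their ratio) and concatenates the ratio-weighted token indicators with class probabilities from a CNN. The function names, the smoothing parameter `alpha`, the binary-label setting, and the hard-coded `cnn_probs` values are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def log_count_ratios(docs, labels, alpha=1.0):
    """Naive Bayes log-count ratio per token for binary labels (1 or 0).

    `alpha` is an assumed smoothing constant, not a value from the paper.
    """
    vocab = sorted({tok for doc in docs for tok in doc})
    pos, neg = Counter(), Counter()
    for doc, y in zip(docs, labels):
        (pos if y == 1 else neg).update(doc)
    # Smoothed counts for each class, normalised by their L1 norms.
    p = {t: alpha + pos[t] for t in vocab}
    q = {t: alpha + neg[t] for t in vocab}
    p_sum, q_sum = sum(p.values()), sum(q.values())
    return {t: math.log((p[t] / p_sum) / (q[t] / q_sum)) for t in vocab}


def combine_features(doc, ratios, vocab, cnn_probs):
    """Concatenate ratio-weighted token indicators with CNN class probabilities."""
    present = set(doc)
    stat = [ratios[t] if t in present else 0.0 for t in vocab]
    return stat + list(cnn_probs)


# Toy data; the CNN probabilities are placeholders for a trained model's output.
docs = [["good", "movie"], ["great", "movie"], ["bad", "movie"], ["awful", "plot"]]
labels = [1, 1, 0, 0]
ratios = log_count_ratios(docs, labels)
vocab = sorted(ratios)
features = combine_features(["good", "plot"], ratios, vocab, cnn_probs=[0.8, 0.2])
```

In this toy data, tokens seen mostly in positive sentences (such as `good`) get a positive ratio and tokens seen mostly in negative sentences (such as `bad`) get a negative one. A vector like `features` would then be passed to a logistic regression classifier, as the abstract describes; C-SFCNN would swap the roles, pairing CNN-derived features with probabilities from the statistical model.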
