Improving text classification with weighted word embeddings via a multi-channel TextCNN model

Guo Bao; Zhang Chunxia; Liu Junmin; Ma Xiaoyi


Abstract

In recent years, convolutional neural networks (CNNs) have gained considerable attention in text classification because of their remarkably good performance in a variety of settings. The usual practice is to first perform word embedding (i.e., map each word to a word vector) and then employ a CNN for classification. Term weighting approaches have proven quite effective at improving classification accuracy. To the best of our knowledge, however, almost all of these methods assign only one weight to each term (word). Because a term generally has different importance in documents with different class labels, we propose a novel term weighting scheme that is combined with word embeddings to enhance the classification performance of CNNs. In the proposed method, multiple weights are assigned to each term, and these weights are applied separately to the word embeddings. The transformed features are then fed into a multi-channel CNN model to predict the label of the sentence. In comparisons with several baseline methods on five benchmark data sets, the classification accuracy of the proposed method exceeds that of the other methods by a considerable margin. The weights assigned by different weighting schemes are also analyzed to gain more insight into their working mechanisms. © 2019 Elsevier B.V. All rights reserved.
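The idea of giving each term several weights, one per class, and applying them to its word embedding to form separate input channels can be illustrated with a minimal sketch. The abstract does not state the paper's weighting formula, so the weight used here (the share of a class's documents that contain the term) is only a placeholder assumption. The function names `class_term_weights` and `multi_channel_input`, the toy corpus, and the two-dimensional embeddings are likewise hypothetical.

```python
from collections import defaultdict


def class_term_weights(docs, labels):
    """Return {term: {label: weight}}.

    Placeholder scheme (an assumption, not the paper's formula):
    weight = fraction of the class's documents that contain the term.
    """
    classes = sorted(set(labels))
    docs_per_class = {c: 0 for c in classes}
    doc_freq = defaultdict(lambda: {c: 0 for c in classes})
    for tokens, label in zip(docs, labels):
        docs_per_class[label] += 1
        for term in set(tokens):  # count each term once per document
            doc_freq[term][label] += 1
    return {
        term: {c: counts[c] / docs_per_class[c] for c in classes}
        for term, counts in doc_freq.items()
    }


def multi_channel_input(tokens, embeddings, weights, classes):
    """Build one channel per class by scaling each word vector
    with that class's weight for the word.

    Every token must appear in both `embeddings` and `weights`.
    Result shape: [num_classes][len(tokens)][embedding_dim].
    """
    return [
        [[weights[t][c] * x for x in embeddings[t]] for t in tokens]
        for c in classes
    ]


# Toy corpus and embeddings, invented for illustration only.
docs = [["good", "movie"], ["bad", "movie"], ["good", "plot"]]
labels = ["pos", "neg", "pos"]
embeddings = {
    "good": [1.0, 0.0],
    "bad": [0.0, 1.0],
    "movie": [0.5, 0.5],
    "plot": [0.2, 0.8],
}

weights = class_term_weights(docs, labels)
classes = sorted(set(labels))  # ['neg', 'pos']
channels = multi_channel_input(["good", "movie"], embeddings, weights, classes)
```

In this sketch each class contributes one channel, so a sentence becomes a stack of weighted embedding matrices that a multi-channel CNN could take as input. The convolutional model itself is not shown.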


Bibliographic details

Source
Neurocomputing | 2019, Issue 21 | pp. 366-374 | 9 pages
Authors
Guo Bao; Zhang Chunxia; Liu Junmin; Ma Xiaoyi
Affiliations

Xi An Jiao Tong Univ Sch Math & Stat Xian 710049 Shaanxi Peoples R China;

Univ Colorado Sch Art & Sci Boulder CO 80310 USA;

Indexed in: Science Citation Index (SCI), USA; Engineering Index (EI), USA
Original format: PDF
Language: English
Keywords
Text classification; Term weighting; Word embedding; Convolutional neural network; Term frequency-inverse document frequency (TF-IDF);


Similar literature

1. Improving the accuracy using pre-trained word embeddings on deep neural networks for Turkish text classification [J]. Physica A: Statistical Mechanics and its Applications. 2020
2. Semantic expansion using word embedding clustering and convolutional neural network for improving short text classification [J]. Wang Peng, Xu Bo, Xu Jiaming. Neurocomputing. 2016, Issue JANa22PTaB
3. Research on improved text classification method based on combined weighted model [J]. Wang Yongchang, Zhu Ligu. Concurrency, practice and experience. 2020, Issue 6
4. A Weighted Word Embedding Model for Text Classification [C]. Haopeng Ren, ZeQuan Zeng, Yi Cai. International conference on database systems for advanced applications. 2019
5. Things and Strings and More: Improving Place Name Disambiguation from Short Texts by Combining Entity Co-Occurrence, Topic Modeling, and Word Embedding [D]. Ju, Yiting. 2017
6. Application of an emotional classification model in e-commerce text based on an improved transformer model [O]. Xuyang Wang, Yixuan Tong. 2021
7. Improving text classification with word embedding [O]. Lihao Ge, Teng-Sheng Moh. 2017
