IEEE International Conference on Cloud Computing and Big Data Analysis

Chinese Weibo sentiment analysis based on character embedding with dual-channel convolutional neural network

Abstract

Sentiment analysis is one of the most compelling tasks in natural language processing (NLP) and is growing steadily in popularity. In this paper, we propose a new sentiment analysis method, char-DCCNN, which combines pre-trained character embeddings with a dual-channel convolutional neural network to classify the sentiment of short Chinese comments on Sina Weibo. First, we split the Chinese corpus into single characters, which are then trained as character vectors; characters that appear infrequently are randomly initialized. The resulting vector matrix representing each text is then fed into a two-channel convolutional neural network. The vectors in one channel remain static (serving as a kind of global feature), while those in the other are fine-tuned on the input data (serving as a kind of local feature). Finally, we record the training performance, the validation performance, and the final average validation performance to reflect the sentiment classification results. The data consist of the NLPCC2012 micro-blog sentiment analysis dataset and reviews on sports, film, social, and other topics crawled from Sina Weibo. We evaluate with 10-fold cross-validation. Three experiments demonstrate the performance of our method along three dimensions: embedding, activation function, and channel. Overall, the proposed char-DCCNN improves sentiment classification results on short Chinese Weibo comments and is of practical significance.
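
The data-preparation steps described in the abstract (character splitting, embedding lookup with random initialization for rare characters, two input channels, and 10-fold splitting) might look roughly like the following standard-library sketch. It is not the authors' implementation: the function names, the `min_count` threshold, the uniform initialization range, and the `pretrained` dictionary are all assumptions made for illustration, and the convolutional network itself is omitted.

```python
import random
from collections import Counter


def split_chars(text):
    """Split a comment into single characters, dropping whitespace."""
    return [ch for ch in text if not ch.isspace()]


def build_embeddings(corpus, pretrained, dim, min_count=2, seed=0):
    """Build a character -> vector table.

    Characters seen at least `min_count` times that have a pre-trained
    vector reuse it; all others are randomly initialized. The threshold
    and the [-0.25, 0.25] range are assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    counts = Counter(ch for text in corpus for ch in split_chars(text))
    table = {}
    for ch, n in counts.items():
        if n >= min_count and ch in pretrained:
            table[ch] = list(pretrained[ch])
        else:
            table[ch] = [rng.uniform(-0.25, 0.25) for _ in range(dim)]
    return table


def dual_channel_input(text, table):
    """Return two copies of the character-vector matrix for one text.

    `static` is immutable (the frozen channel); `tuned` is a mutable copy
    that a training loop would update (the fine-tuned channel).
    Assumes every character of `text` is already in `table`.
    """
    chars = split_chars(text)
    static = tuple(tuple(table[ch]) for ch in chars)
    tuned = [list(table[ch]) for ch in chars]
    return static, tuned


def k_fold_indices(n, k=10, seed=0):
    """Yield (train, validation) index lists for k-fold cross-validation."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, folds[i]
```

In a deep-learning framework, the two channels would typically be two embedding layers initialized from the same table, one with gradient updates disabled and one with them enabled; the tuple/list split above only mimics that distinction.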