Journal: Engineering Journal

EmoCNN: Encoding Emotional Expression from Text to Word Vector and Classifying Emotions—A Case Study in Thai Social Network Conversation


Abstract

We present EmoCNN, a specially trained word embedding layer combined with a convolutional neural network model that classifies conversational text into 4 types of emotion. The model is part of a chatbot for depression evaluation. Classifying emotion from conversational text is difficult because most word embeddings are trained on emotionally neutral corpora such as Wikipedia or news articles, where emotional words appear rarely or not at all and the language style is formal. We trained a new word embedding based on the word2vec architecture in an unsupervised manner and then fine-tuned it on soft-labelled data obtained by mining Twitter with emotion keywords. We show that this emotion word embedding can differentiate between words with the same polarity and words with opposite polarity, and can find similar words with the same polarity, whereas the standard word embedding cannot. We then used the new embedding as the first layer of EmoCNN, which classifies conversational text into the 4 emotions. EmoCNN achieved a macro-averaged F1-score of 0.76 on the test set. We compared EmoCNN against three other models: a shallow fully-connected neural network, a fine-tuned RoBERTa, and ULMFiT. Their best macro-averaged F1-scores were 0.5556, 0.6402, and 0.7386, respectively.
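
The reported scores are macro-averaged F1-scores: the F1-score is computed separately for each emotion class and the per-class values are averaged with equal weight, so rare classes count as much as frequent ones. The following is a minimal sketch of that metric in plain Python, not the authors' evaluation code. The label names and the toy predictions are hypothetical, since the abstract does not name the 4 emotions or give per-example results.

```python
from collections import Counter


def macro_f1(y_true, y_pred, labels):
    """Return the unweighted mean of the per-class F1 scores."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        # Assumption: a class that never occurs and is never predicted scores 0.
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(labels)


# Hypothetical placeholder labels and toy data, for illustration only.
LABELS = ["e1", "e2", "e3", "e4"]
y_true = ["e1", "e1", "e2", "e2", "e3", "e3", "e4", "e4"]
y_pred = ["e1", "e2", "e2", "e2", "e3", "e4", "e4", "e4"]

score = macro_f1(y_true, y_pred, LABELS)
print(f"macro-averaged F1: {score:.4f}")
```

In this toy example the per-class F1 scores are 2/3, 4/5, 2/3 and 4/5, so the macro average is 11/15.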
