Topic-Enriched Word Embeddings for Sarcasm Identification

Published in: Computer Science On-line Conference

Abstract

Sarcasm is a type of nonliteral language in which people may express negative sentiment using words with a positive literal meaning and, conversely, use words with a negative meaning to indicate positive sentiment. User-generated text messages on social platforms may contain sarcasm. A sarcastic utterance may change the sentiment orientation of a text document from positive to negative, or vice versa. Hence, the predictive performance of sentiment classification schemes may be degraded if sarcasm is not properly handled. In this paper, we present a deep-learning-based approach to sarcasm identification. The predictive performance of a topic-enriched word embedding scheme is compared to that of conventional word embedding schemes (such as word2vec, fastText and GloVe). In addition to word-embedding-based feature sets, conventional lexical, pragmatic, implicit incongruity and explicit incongruity based feature sets are considered. In the experimental analysis, six subsets of Twitter messages are taken into account, ranging from 5,000 to 30,000 messages. The experimental analysis indicates that topic-enriched word embedding schemes used in conjunction with conventional feature sets can yield promising results for sarcasm identification.
