
Sarcasm Detection Using an Ensemble Approach


Abstract

We present an ensemble approach for the detection of sarcasm in Reddit and Twitter responses in the context of The Second Workshop on Figurative Language Processing, held in conjunction with ACL 2020. The ensemble is trained on the predicted sarcasm probabilities of four component models and on additional features, such as the sentiment of the comment, its length, and its source (Reddit or Twitter), in order to learn which of the component models is the most reliable for which input. The component models consist of an LSTM with hashtag and emoji representations; a CNN-LSTM with casing, stop word, punctuation, and sentiment representations; an MLP based on InferSent embeddings; and an SVM trained on stylometric and emotion-based features. All component models use the two conversational turns preceding the response as context, except for the SVM, which only uses features extracted from the response. The ensemble itself consists of an AdaBoost classifier with the decision tree algorithm as base estimator and yields F1-scores of 67% and 74% on the Reddit and Twitter test data, respectively.
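The stacking step described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: it assumes discrete AdaBoost over depth-1 decision trees (stumps), whereas the abstract does not state the tree depth, the number of boosting rounds, or the AdaBoost variant. The function names (`build_features`, `fit_adaboost`, and so on) and every number in the toy data are hypothetical and chosen only so the example runs.

```python
import math

def build_features(component_probs, sentiment, length, is_twitter):
    """Stack the four component probabilities with the additional features."""
    return list(component_probs) + [sentiment, float(length), 1.0 if is_twitter else 0.0]

def stump_predict(stump, row):
    """A depth-1 tree: predict `polarity` if the feature exceeds the threshold."""
    _, j, t, polarity = stump
    return polarity if row[j] > t else -polarity

def fit_stump(X, y, w):
    """Exhaustively pick the (feature, threshold, polarity) with lowest weighted error."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for polarity in (1, -1):
                candidate = (0.0, j, t, polarity)
                err = sum(wi for row, yi, wi in zip(X, y, w)
                          if stump_predict(candidate, row) != yi)
                if best is None or err < best[0]:
                    best = (err, j, t, polarity)
    return best

def fit_adaboost(X, y, n_rounds=5):
    """Discrete AdaBoost; labels must be +1 (sarcastic) or -1 (not sarcastic)."""
    n = len(X)
    w = [1.0 / n] * n
    model = []
    for _ in range(n_rounds):
        stump = fit_stump(X, y, w)
        err = min(max(stump[0], 1e-10), 1 - 1e-10)  # clamp to keep the log finite
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, stump))
        # Increase the weight of misclassified rows, then renormalise.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for row, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def predict(model, row):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(alpha * stump_predict(stump, row) for alpha, stump in model)
    return 1 if score >= 0 else -1

# Toy data (invented for illustration): component probabilities in the order
# LSTM, CNN-LSTM, MLP, SVM, followed by sentiment, length, and source.
X = [
    build_features([0.9, 0.8, 0.7, 0.6], -0.5, 12, True),
    build_features([0.8, 0.4, 0.9, 0.7], 0.1, 30, False),
    build_features([0.7, 0.6, 0.6, 0.3], -0.2, 8, True),
    build_features([0.2, 0.5, 0.3, 0.4], 0.3, 20, False),
    build_features([0.3, 0.7, 0.1, 0.6], 0.0, 15, True),
    build_features([0.1, 0.3, 0.4, 0.2], 0.4, 25, False),
]
y = [1, 1, 1, -1, -1, -1]

model = fit_adaboost(X, y, n_rounds=5)
new_row = build_features([0.85, 0.5, 0.5, 0.5], 0.0, 10, True)
print(predict(model, new_row))
```

Because the meta-classifier sees the source flag and the other side features next to the component probabilities, deeper trees than the stumps used here could in principle split on source first and then on a particular component's probability, which is one way to read "which of the component models is the most reliable for which input".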
