'When Numbers Matter!': Detecting Sarcasm in Numerical Portions of Text

Abstract

Research in sarcasm detection spans almost a decade. However, a particular form of sarcasm remains unexplored: sarcasm expressed through numbers, which, we estimate, forms about 11% of the sarcastic tweets in our dataset. The sentence 'Love waking up at 3 am' is sarcastic because of the number. In this paper, we focus on detecting sarcasm in tweets arising out of numbers. Initially, to gain insight into the problem, we implement a rule-based classifier and a statistical machine learning (ML) classifier. The rule-based classifier conveys the crux of the numerical sarcasm problem, namely, incongruity arising out of numbers. The statistical ML classifier uncovers the indicators, i.e., the features, of such sarcasm. The actual system, however, consists of two deep learning (DL) models, a CNN and an attention network, which obtain F-scores of 0.93 and 0.91 on our dataset of tweets containing numbers. To the best of our knowledge, this is the first line of research investigating the phenomenon of sarcasm arising out of numbers, culminating in a detector thereof.
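
The abstract does not describe the rules used, so the following is only a toy sketch of what "incongruity arising out of numbers" could look like in code, not the authors' classifier. The word list `POSITIVE_WORDS`, the table `TYPICAL_HOURS`, and the function names are all hypothetical and invented for illustration: a tweet is flagged when a positive word accompanies an activity whose stated clock time falls outside an assumed typical range.

```python
import re

# Hypothetical lexicon, invented for illustration only.
POSITIVE_WORDS = {"love", "great", "awesome", "enjoy"}

# Assumed "unremarkable" hours (24h clock) for an activity; not from the paper.
TYPICAL_HOURS = {"waking up": range(6, 11)}

# Matches clock times such as "3 am" or "11pm".
TIME_PATTERN = re.compile(r"\b(\d{1,2})\s*(am|pm)\b", re.IGNORECASE)


def to_24h(hour, meridiem):
    """Convert a 12-hour clock value to a 24-hour value."""
    hour = hour % 12
    return hour + 12 if meridiem.lower() == "pm" else hour


def is_numerically_sarcastic(tweet):
    """Flag a tweet whose positive wording clashes with an atypical number."""
    text = tweet.lower()
    words = set(re.findall(r"[a-z']+", text))
    if not words & POSITIVE_WORDS:
        return False  # no positive sentiment, so no incongruity to detect
    match = TIME_PATTERN.search(text)
    if match is None:
        return False  # no clock time in the tweet
    hour = to_24h(int(match.group(1)), match.group(2))
    for activity, typical in TYPICAL_HOURS.items():
        if activity in text:
            # Incongruity: positive wording, but the number is atypical.
            return hour not in typical
    return False


if __name__ == "__main__":
    print(is_numerically_sarcastic("Love waking up at 3 am"))
    print(is_numerically_sarcastic("Love waking up at 9 am"))
```

Under these assumptions, the abstract's example 'Love waking up at 3 am' is flagged while the same sentence with '9 am' is not. A hand-written table like this clearly cannot scale, which is consistent with the abstract's move from rules to learned features and then to DL models.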
