Conference: International Conference on Information Technology

Natural Language Processing Based Features for Sarcasm Detection: An Investigation Using Bilingual Social Media Texts

Abstract

The presence of sarcasm in text can hamper the performance of sentiment analysis, so the challenge is to detect sarcasm when it occurs. This challenge is compounded when bilingual texts are considered, for example Malay social media data. This paper proposes a feature extraction process for detecting sarcasm in bilingual texts, specifically public comments on economy-related posts on Facebook. Four categories of features that can be extracted using natural language processing are considered: lexical, pragmatic, prosodic and syntactic. We also investigate the use of an idiosyncratic feature to capture peculiar and odd comments found in a text. To determine the effectiveness of the proposed process, a non-linear Support Vector Machine was used to classify texts, represented by the identified features, according to whether or not they included sarcastic content. The results demonstrate that a combination of syntactic, pragmatic and prosodic features produced the best performance, with an F-measure of 0.852.
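As a rough illustration of the kind of pipeline the abstract describes, the sketch below maps a comment to a few numeric features and computes an F-measure from predicted labels. It is not the authors' method: the abstract does not list the actual features or lexicons, so the interjection list, the feature names, the helpers `extract_features` and `f_measure`, and the sample comment are all invented for illustration. The non-linear Support Vector Machine step is left out to keep the example dependency-free.

```python
import re

# Hypothetical cue list; the abstract does not publish the paper's lexicons.
INTERJECTIONS = {"wow", "haha", "wah", "aduh", "yeah"}


def extract_features(comment: str) -> dict:
    """Map one comment to a few illustrative numeric features."""
    tokens = re.findall(r"[a-zA-Z']+", comment.lower())
    return {
        # Pragmatic-style cue (assumed): heavy use of ! and ? marks.
        "punctuation_marks": sum(comment.count(c) for c in "!?"),
        # Prosodic-style cue (assumed): a letter repeated 3+ times, e.g. "sooo".
        "elongated_words": sum(1 for t in tokens if re.search(r"(.)\1{2,}", t)),
        # Lexical-style cue (assumed): interjections from the toy list above.
        "interjections": sum(1 for t in tokens if t in INTERJECTIONS),
        "token_count": len(tokens),
    }


def f_measure(y_true: list, y_pred: list) -> float:
    """F-measure (F1) for the positive class, labelled 1 (sarcastic)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Invented sample comment mixing Malay and English, for illustration only.
    print(extract_features("Wow, harga minyak naik lagi... baguuus sangat!!!"))
    print(f_measure([1, 1, 0, 0], [1, 0, 0, 1]))
```

In a fuller version, the feature dictionaries would be turned into vectors and passed to a classifier with a non-linear kernel. The F-measure is the harmonic mean of precision and recall, which is the metric behind the 0.852 score reported above.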
