Conference: Annual Meeting of the Association for Computational Linguistics

Detecting Offensive Tweets in Hindi-English Code-Switched Language

Abstract

The exponential rise of social media websites like Twitter, Facebook and Reddit in linguistically diverse geographical regions has led to the hybridization of popular native languages with English in an effort to ease communication. The paper focuses on the classification of offensive tweets written in Hinglish, a code-switched blend of the Indic language Hindi and English written in the Roman script. The paper introduces a novel tweet dataset, titled the Hindi-English Offensive Tweet (HEOT) dataset, consisting of tweets in Hindi-English code-switched language split into three classes: non-offensive, abusive and hate-speech. Further, we approach the problem of classifying the tweets in the HEOT dataset using transfer learning: the proposed model, which employs Convolutional Neural Networks, is pre-trained on English tweets and then retrained on Hinglish tweets.
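
The two-stage training schedule described in the abstract (pre-train on English, then retrain on Hinglish) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the paper's implementation: a linear softmax classifier over bag-of-words counts stands in for the Convolutional Neural Network, the six example tweets are toy placeholders rather than HEOT data, and every name here (`featurize`, `train`, `predict`, `VOCAB`, the learning rate and epoch counts) is hypothetical.

```python
import math

CLASSES = ["non-offensive", "abusive", "hate-speech"]

# Toy placeholder data (NOT from the HEOT dataset); "badword" and "hateword"
# stand in for real abusive or hateful vocabulary.
ENGLISH_TWEETS = [
    ("have a nice day", 0),
    ("you are badword", 1),
    ("hateword them all", 2),
]
HINGLISH_TWEETS = [
    ("aapka din nice ho", 0),
    ("tum badword ho", 1),
    ("un sab ko hateword", 2),
]

# Shared vocabulary so weights learned on English carry over to Hinglish.
VOCAB = sorted({tok for text, _ in ENGLISH_TWEETS + HINGLISH_TWEETS
                for tok in text.split()})


def featurize(text, vocab):
    """Bag-of-words counts plus a constant 1.0 that acts as the bias input."""
    tokens = text.lower().split()
    return [float(tokens.count(word)) for word in vocab] + [1.0]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [v / total for v in exps]


def predict_proba(weights, text, vocab):
    """Class probabilities for one tweet; `weights` has one row per class."""
    x = featurize(text, vocab)
    return softmax([sum(w * xi for w, xi in zip(row, x)) for row in weights])


def predict(weights, text, vocab):
    """Index of the most probable class."""
    probs = predict_proba(weights, text, vocab)
    return probs.index(max(probs))


def train(weights, data, vocab, epochs, lr):
    """Plain SGD on cross-entropy loss; updates `weights` in place."""
    for _ in range(epochs):
        for text, label in data:
            x = featurize(text, vocab)
            probs = softmax([sum(w * xi for w, xi in zip(row, x))
                             for row in weights])
            for cls, row in enumerate(weights):
                grad = probs[cls] - (1.0 if cls == label else 0.0)
                for j, xj in enumerate(x):
                    row[j] -= lr * grad * xj
    return weights


# Stage 1: pre-train on English tweets, starting from zero weights.
model = [[0.0] * (len(VOCAB) + 1) for _ in CLASSES]
train(model, ENGLISH_TWEETS, VOCAB, epochs=200, lr=0.1)

# Stage 2: continue training the SAME weights on Hinglish tweets.
train(model, HINGLISH_TWEETS, VOCAB, epochs=200, lr=0.1)

for text, _ in HINGLISH_TWEETS:
    print(text, "->", CLASSES[predict(model, text, VOCAB)])
```

The point of the sketch is that the second call to `train` starts from the weights produced by the first call instead of a fresh initialization, so anything learned from tokens shared between the English and Hinglish data is reused. A real system would replace the linear model with a convolutional network and the toy lists with the actual corpora.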
