Conference: Workshop on Trolling, Aggression and Cyberbullying

FlorUniTo@TRAC-2: Retrofitting Word Embeddings on an Abusive Lexicon for Aggressive Language Detection

Abstract

This paper describes our participation in the TRAC-2 Shared Tasks on Aggression Identification. Our team, FlorUniTo, investigated whether an abusive lexicon can be used to enhance word embeddings and thereby improve the detection of aggressive language. The embeddings used in our paper are word-aligned pre-trained vectors for English, Hindi, and Bengali, chosen to reflect the languages represented in the shared task datasets. The embeddings are retrofitted to a multilingual abusive lexicon, HurtLex. We experimented with an LSTM model using the original as well as the transformed embeddings, across different language and setting variations. Overall, our systems placed toward the middle of the official rankings based on weighted F1 score. Furthermore, the results on the development and test sets show promise for this novel avenue of research.
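
The abstract does not spell out the retrofitting procedure, so the following is only a minimal sketch of the general idea, assuming the commonly used iterative update from Faruqui et al. (2015): each word that appears in the lexicon is pulled toward the average of its lexicon neighbours while staying anchored to its original vector. The function name `retrofit`, the toy vectors, and the toy lexicon are hypothetical illustrations; they are not taken from the paper or from HurtLex.

```python
def retrofit(vectors, lexicon, iterations=10):
    """Return a retrofitted copy of `vectors` (assumed update rule, not the paper's code).

    vectors: dict mapping word -> list of floats (pre-trained embedding)
    lexicon: dict mapping word -> list of related words (e.g. same lexicon category)
    """
    original = {w: list(v) for w, v in vectors.items()}
    fitted = {w: list(v) for w, v in vectors.items()}
    for _ in range(iterations):
        for word, neighbours in lexicon.items():
            # Only neighbours that have an embedding can contribute.
            known = [n for n in neighbours if n in fitted]
            if word not in fitted or not known:
                continue
            dims = range(len(original[word]))
            # Average of the neighbours' current (already updated) vectors.
            mean = [sum(fitted[n][d] for n in known) / len(known) for d in dims]
            # Equal weight on the neighbour average and the original vector.
            fitted[word] = [(mean[d] + original[word][d]) / 2.0 for d in dims]
    return fitted


# Hypothetical toy data: two lexicon entries in one category, one unrelated word.
toy_vectors = {
    "idiot": [1.0, 0.0],
    "moron": [0.0, 1.0],
    "table": [-1.0, -1.0],
}
toy_lexicon = {"idiot": ["moron"], "moron": ["idiot"]}
toy_fitted = retrofit(toy_vectors, toy_lexicon)
```

In this sketch, words that share a lexicon entry move closer together, while words outside the lexicon (here `table`) keep their original vectors. The retrofitted vectors could then replace the original ones in the embedding layer of a downstream classifier such as the LSTM mentioned above.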