
jhan014 at SemEval-2019 Task 6: Identifying and Categorizing Offensive Language in Social Media


Abstract

In this paper, team jhan014 presents two methods for identifying and categorizing offensive language on Twitter. In the first method, we develop a deep neural network consisting of bidirectional recurrent layers with Gated Recurrent Unit (GRU) cells followed by fully connected layers. In the second method, we establish a probabilistic model, modified sentence offensiveness calculation (MSOC), which evaluates a sentence's offensiveness level and target level according to the sub-task. Based on the task results, we evaluate the performance of each method by F1 score and analyze the advantages and disadvantages of the two methods in terms of type I and type II errors. In conclusion, the deep neural network performs well in all sub-tasks but makes more type I errors and fails to categorize subclasses with little data or few distinguishing characteristics, while the MSOC model does better at target categorization but makes more type II errors in offensive language identification.
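The evaluation described in the abstract (F1 score plus counts of type I and type II errors) can be illustrated with a minimal sketch. It is not the authors' evaluation code. It assumes a binary setting in which label 1 means "offensive" (the positive class), so a type I error is a false positive and a type II error is a false negative. The function name `binary_error_report` and the toy labels are hypothetical, and the sketch computes F1 for the positive class only, since the abstract does not state which averaging scheme was used.

```python
from typing import Dict, Sequence


def binary_error_report(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Count type I / type II errors and compute positive-class F1.

    Assumption: 1 = offensive (positive class), 0 = not offensive.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # offensive, flagged
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # type I error
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # type II error

    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0

    return {
        "type_i_errors": fp,
        "type_ii_errors": fn,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


if __name__ == "__main__":
    # Toy labels for illustration only; not data from the paper.
    gold = [1, 1, 1, 0, 0, 0]
    pred = [1, 1, 0, 1, 0, 0]
    print(binary_error_report(gold, pred))
```

Under these assumptions, a model with more type I errors flags benign tweets as offensive more often, while a model with more type II errors misses offensive tweets, which is the trade-off the abstract reports between the two methods.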
