Venue: Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies; International Workshop on Semantic Evaluation

CAMsterdam at SemEval-2019 Task 6: Neural and graph-based feature extraction for the identification of offensive tweets



Abstract

We describe the CAMsterdam team entry to the SemEval-2019 Shared Task 6 on offensive language identification in Twitter data. Our proposed model learns to extract textual features using a multi-layer recurrent network, and then performs text classification using gradient-boosted decision trees (GBDT). A self-attention architecture enables the model to focus on the most relevant areas in the text. We additionally learn globally optimised embeddings for hashtags using node2vec, which are given as additional tweet features to the GBDT classifier. Our best model obtains 78.79% macro F1-score on detecting offensive language (subtask A), 66.32% on categorising offence types (targeted/untargeted; subtask B), and 55.36% on identifying the target of offence (subtask C).
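The scores above are macro-averaged F1, that is, the unweighted mean of the per-class F1 scores, so a rare class counts as much as a frequent one. The following is a minimal sketch of that metric in plain Python, not the authors' evaluation code; the function name `macro_f1` and the toy `OFF`/`NOT` label lists are assumptions made purely for illustration.

```python
from collections import Counter


def macro_f1(gold, pred):
    """Unweighted mean of per-class F1 over every label seen in gold or pred."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    labels = sorted(set(gold) | set(pred))
    if not labels:
        raise ValueError("at least one example is required")

    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1   # correct prediction for class g
        else:
            fp[p] += 1   # p was predicted but is wrong
            fn[g] += 1   # g was the true class but was missed

    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy data (hypothetical, not taken from the shared-task dataset).
gold = ["OFF", "OFF", "NOT", "NOT", "NOT", "NOT"]
pred = ["OFF", "NOT", "NOT", "NOT", "NOT", "OFF"]
score = macro_f1(gold, pred)  # per-class F1: OFF = 0.5, NOT = 0.75
```

On this toy input the two per-class scores average to 0.625, although 4 of the 6 predictions are correct. The gap arises because the minority class contributes equally to the mean.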
