Published in: Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

UVA Wahoos at SemEval-2019 Task 6: Hate Speech Identification using Ensemble Machine Learning

Abstract

With the growth of social media, it has become increasingly common for people to hide behind a mask and abuse others. We attempt to detect tweets and comments with malicious intent that target either an individual or a group. For SubTask_A (classifying tweets as offensive or non-offensive), our best classifier achieves an accuracy of 83.14% and an F1-score of 0.7565 on the actual test data. For SubTask_B (identifying whether an offensive tweet is targeted, that is, directed at an individual or a group), the classifier achieves an accuracy of 89.17% and an F1-score of 0.5885. The paper describes how we generated linguistic and semantic features to build an ensemble machine learning model. It also shows how accuracy changes when the model is trained on additional data from other sources (Facebook and more tweets).
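
The SubTask_B figures pair a high accuracy (89.17%) with a much lower F1-score (0.5885). One common cause of such a gap is class imbalance combined with an F1 averaged over classes. The abstract does not say how the F1-score was averaged, so the sketch below assumes macro-averaging. The label names and the toy predictions are invented for illustration and are not the paper's data.

```python
# Minimal sketch: accuracy vs. macro-averaged F1 on an imbalanced toy set.
# Assumptions (not from the paper): macro-averaging, the label names,
# and the toy labels/predictions below.

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1_for_label(y_true, y_pred, label):
    """F1-score treating `label` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-label F1-scores."""
    labels = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_label(y_true, y_pred, l) for l in labels) / len(labels)

# Hypothetical data: 90 targeted tweets, 10 untargeted tweets.
y_true = ["targeted"] * 90 + ["untargeted"] * 10
# The classifier gets every targeted tweet right but finds only 2 of the
# 10 untargeted ones.
y_pred = ["targeted"] * 90 + ["targeted"] * 8 + ["untargeted"] * 2

acc = accuracy(y_true, y_pred)
mf1 = macro_f1(y_true, y_pred)
print(f"accuracy = {acc:.4f}")
print(f"macro F1 = {mf1:.4f}")
```

In this toy case the majority class dominates accuracy, while the poorly recognised minority class pulls the macro-averaged F1 down. Whether the same effect explains the reported SubTask_B numbers depends on the class distribution and averaging used in the paper.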
