SSN_NLP at SemEval-2019 Task 6: Offensive Language Identification in Social Media using Traditional and Deep Machine Learning Approaches



Abstract

Offensive language identification (OLI) in user-generated text is the automatic detection of any profanity, insult, obscenity, racism or vulgarity that degrades an individual or a group. It is helpful for hate speech detection, flame detection and cyberbullying detection. Given the immense growth in access to social media, OLI helps to prevent abuse and harm. In this paper, we present deep and traditional machine learning approaches for OLI. In the deep learning approach, we used bi-directional LSTMs with different attention mechanisms to build the models; in the traditional machine learning approach, we used TF-IDF weighting schemes with two classifiers, Multinomial Naive Bayes and Support Vector Machines trained with a Stochastic Gradient Descent optimizer. The approaches are evaluated on the OffensEval@SemEval2019 dataset, and our team SSN_NLP submitted runs for the three tasks of the OffensEval shared task. The best runs of SSN_NLP obtained F1 scores of 0.53, 0.48 and 0.3 and accuracies of 0.63, 0.84 and 0.42 for Tasks A, B and C respectively. Our approaches improved the baseline F1 scores by 12%, 26% and 14% for Tasks A, B and C respectively.
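To make the traditional pipeline concrete, the following is a minimal sketch of TF-IDF weighting feeding a Multinomial Naive Bayes classifier, written with the Python standard library only. It is not the authors' implementation: the abstract does not give their features, preprocessing or hyperparameters, so the whitespace tokenizer, the smoothed IDF formula, the `alpha` smoothing value, the class name `TfidfNaiveBayes`, the labels `OFF`/`NOT` and the six toy training sentences are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: lowercase whitespace tokenization; the paper's preprocessing is not stated.
    return text.lower().split()


class TfidfNaiveBayes:
    """Multinomial Naive Bayes over TF-IDF-weighted token counts (illustrative sketch)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # additive smoothing; hypothetical default

    def fit(self, texts, labels):
        docs = [tokenize(t) for t in texts]
        n_docs = len(docs)
        # Document frequency: number of documents that contain each token.
        df = Counter(tok for doc in docs for tok in set(doc))
        # Smoothed inverse document frequency (one common variant, assumed here).
        self.idf = {tok: math.log((1 + n_docs) / (1 + d)) + 1.0 for tok, d in df.items()}
        self.vocab = set(df)
        # Accumulate TF-IDF mass per class and token.
        weight = defaultdict(Counter)
        class_docs = Counter(labels)
        for doc, label in zip(docs, labels):
            for tok, tf in Counter(doc).items():
                weight[label][tok] += tf * self.idf[tok]
        self.classes = sorted(class_docs)
        self.log_prior = {c: math.log(class_docs[c] / n_docs) for c in self.classes}
        # Smoothed per-class token log-likelihoods.
        self.log_like = {}
        for c in self.classes:
            total = sum(weight[c].values()) + self.alpha * len(self.vocab)
            self.log_like[c] = {
                tok: math.log((weight[c][tok] + self.alpha) / total) for tok in self.vocab
            }
        return self

    def predict(self, text):
        # Tokens never seen in training are ignored.
        feats = Counter(tok for tok in tokenize(text) if tok in self.vocab)
        scores = {
            c: self.log_prior[c]
            + sum(tf * self.idf[tok] * self.log_like[c][tok] for tok, tf in feats.items())
            for c in self.classes
        }
        return max(scores, key=scores.get)


# Toy data invented for this sketch; it is not drawn from the OffensEval dataset.
train_texts = [
    "you are a stupid idiot",
    "what an idiot move",
    "shut up you fool",
    "have a nice day",
    "thank you for the help",
    "what a lovely day",
]
train_labels = ["OFF", "OFF", "OFF", "NOT", "NOT", "NOT"]

clf = TfidfNaiveBayes().fit(train_texts, train_labels)
print(clf.predict("you stupid fool"))
print(clf.predict("have a lovely day"))
```

A Support Vector Machine trained with Stochastic Gradient Descent, the other traditional classifier named in the abstract, would consume the same TF-IDF features; only the classifier stage would change.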


Bibliographic details

Source
Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies | 2019 | xlvi, p. 662-1323 | 6 pages
Authors
D. Thenmozhi; B. Senthil Kumar; Chandrabose Aravindan; S. Srinethe
Original format: PDF
Chinese Library Classification: Programming, software engineering
