Detecting Offensive Language in Social Media to Protect Adolescent Online Safety

Abstract

Because textual content on social media is highly unstructured, informal, and often misspelled, existing message-level approaches to offensive language detection cannot accurately detect offensive content. User-level offensiveness detection seems a more feasible approach, but it is an under-researched area. To bridge this gap, we propose the Lexical Syntactic Feature (LSF) architecture to detect offensive content and identify potentially offensive users in social media. We distinguish the contributions of pejoratives/profanities and obscenities in determining offensive content, and introduce hand-authored syntactic rules for identifying name-calling harassment. In particular, we incorporate a user's writing style, structure, and specific cyberbullying content as features to predict the user's likelihood of sending offensive content. Experimental results show that the LSF framework performs significantly better than existing methods in offensive content detection. It achieves a precision of 98.24% and a recall of 94.34% in sentence-level offensiveness detection, and a precision of 77.9% and a recall of 77.8% in user-level offensiveness detection. The processing speed of LSF is approximately 10 ms per sentence, suggesting the potential for effective deployment in social media.
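The general idea of combining lexical weights with a syntactic cue, then aggregating to the user level, can be illustrated with a minimal sketch. This is not the authors' implementation: the word lists, the weights, the doubling factor, and the function names `sentence_offensiveness` and `user_offensiveness` are all hypothetical, and token proximity to a second-person pronoun is used here as a crude stand-in for the hand-authored syntactic rules the abstract mentions.

```python
import re

# Hypothetical toy lexicons; the words and weights are illustrative
# assumptions, not values taken from the paper.
STRONG_WORDS = {"idiot": 1.0, "moron": 1.0}
WEAK_WORDS = {"stupid": 0.5, "damn": 0.5}
SECOND_PERSON = {"you", "your", "u", "ur"}


def tokenize(sentence):
    """Lowercase the sentence and split it into simple word tokens."""
    return re.findall(r"[a-z']+", sentence.lower())


def sentence_offensiveness(sentence, window=3):
    """Sum lexical weights, doubling a weight when a second-person
    pronoun appears within `window` tokens (a rough proxy for a
    name-calling rule, assumed here for illustration)."""
    tokens = tokenize(sentence)
    score = 0.0
    for i, token in enumerate(tokens):
        weight = STRONG_WORDS.get(token, WEAK_WORDS.get(token, 0.0))
        if weight == 0.0:
            continue
        nearby = tokens[max(0, i - window): i + window + 1]
        if any(t in SECOND_PERSON for t in nearby):
            weight *= 2.0
        score += weight
    return score


def user_offensiveness(sentences):
    """Average the sentence scores over a user's posts (one simple
    aggregation choice among many possible ones)."""
    if not sentences:
        return 0.0
    return sum(sentence_offensiveness(s) for s in sentences) / len(sentences)
```

In this sketch a sentence that directs a strong term at the reader scores higher than one that applies a weak term to an object, which mirrors the abstract's distinction between word categories and its emphasis on name-calling. A real system would likely rely on a parser and learned features rather than fixed windows and hand-set weights.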
