What We Can Learn from Looking at Profanity

Abstract

Profanity is common in online text. Recent studies found swear words in over 7% of English tweets and 9% of Yahoo! Buzz messages. However, efforts to recognize, understand, and deal with profanity do not share resources, namely their datasets, which leads to duplicated effort and non-comparable results. Here we present a freely available dataset of 2500 messages from a popular Portuguese sports website. About 20% of the messages contained profanity; in them we annotated 726 swear words, 510 of which had been obfuscated by the messages' authors. We also identified the most frequent profanities, and which methods, and combinations of methods, people used to disguise their cursing.