
What We Can Learn from Looking at Profanity



Abstract

Profanity is common in online text. Recent studies found swear words in over 7% of English tweets and 9% of Yahoo! Buzz messages. However, efforts to recognize, understand, and deal with profanity do not share resources, namely their datasets, which leads to duplicated effort and non-comparable results. Here we present a freely available dataset of 2500 messages from a popular Portuguese sports website. About 20% of the messages contained profanity; in them we annotated 726 swear words, 510 of which had been obfuscated by their authors. We also identified the most frequent profanities, as well as the methods, and combinations of methods, that people used to disguise their cursing.
