Automatic detection of gender on the blogs

Abstract

In this paper, we aim to determine a blogger's gender using only the texts that the blogger has written. To that end, we propose a set of features based on specific words, which are grouped into classes. For each blog, a score is computed from these features and used to determine the gender of its author. The evaluation was carried out on a corpus of 681,288 blogs (140 million words), each tagged as written by a man or a woman; this collection serves as the reference in our work. The results show a gender detection rate of over 82% with respect to the reference collection.
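
The abstract does not list the actual word classes, weights, or scoring formula, so the following is only a minimal sketch of what a class-based word score could look like. Everything in it is an assumption made for illustration: the placeholder words, the class names, the signed weights, the length normalization, the zero threshold, and the function names `blog_score`, `predict`, and `accuracy`.

```python
import re
from collections import Counter

# Hypothetical word classes (NOT taken from the paper): each class is a set
# of placeholder words plus a signed weight. Positive weights push the score
# toward the label "female", negative weights toward "male".
WORD_CLASSES = {
    "class_1": ({"alpha", "beta"}, +1.0),
    "class_2": ({"gamma", "delta"}, -1.0),
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def blog_score(text):
    """Sum, over all classes, weight * (class hits / total tokens)."""
    counts = Counter(tokenize(text))
    total = sum(counts.values())
    if total == 0:
        return 0.0  # empty blog: no evidence either way
    score = 0.0
    for words, weight in WORD_CLASSES.values():
        hits = sum(counts[w] for w in words)
        score += weight * hits / total
    return score


def predict(text, threshold=0.0):
    """Assumed decision rule: score above the threshold means 'female'."""
    return "female" if blog_score(text) > threshold else "male"


def accuracy(tagged_blogs):
    """Fraction of (text, label) pairs whose prediction matches the tag."""
    correct = sum(predict(text) == label for text, label in tagged_blogs)
    return correct / len(tagged_blogs)
```

Under these assumptions, `accuracy` applied to a list of `(text, label)` pairs from a tagged corpus would give the kind of agreement rate the abstract reports against its reference collection.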