Conference: IEEE International Conference on Fuzzy Systems

Twitter gender classification using user unstructured information

Abstract

This paper describes an approach for automatically detecting the gender of Twitter users based only on clues found in the unstructured information of their profiles. A number of features that capture phenomena specific to Twitter users are proposed and evaluated on a dataset of about 242K English-language users. Several supervised and unsupervised methods are used to assess the proposed features, including Naive Bayes variants, Logistic Regression, Support Vector Machines, Fuzzy c-Means clustering, and K-means. An unsupervised approach based on Fuzzy c-Means proved well suited to this task, returning the correct gender for about 96% of the users.
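For readers unfamiliar with Fuzzy c-Means, the following is a minimal sketch of the standard two-cluster algorithm in plain Python. It is not the authors' implementation: the abstract does not specify their features or settings, so the feature vectors, the fuzzifier `m=2.0`, the deterministic initialisation, and the function name `fuzzy_c_means` are all assumptions made for illustration. How the resulting clusters are mapped to gender labels is also outside this sketch.

```python
import math


def fuzzy_c_means(points, m=2.0, iters=100, tol=1e-9):
    """Two-cluster Fuzzy c-Means; returns (centers, memberships)."""
    # Deterministic start (an assumption of this sketch): the first point
    # and the point farthest from it serve as the initial centers.
    first = points[0]
    far = max(points, key=lambda p: math.dist(p, first))
    centers = [list(first), list(far)]
    dim = len(first)
    power = 2.0 / (m - 1.0)
    memberships = []
    for _ in range(iters):
        # Membership update: u_ij = 1 / sum_k (d_ij / d_ik)^(2 / (m - 1)).
        memberships = []
        for p in points:
            # Clamp distances so a point sitting on a center does not divide by zero.
            d = [max(math.dist(p, c), 1e-12) for c in centers]
            memberships.append(
                [1.0 / sum((dj / dk) ** power for dk in d) for dj in d]
            )
        # Center update: mean of all points weighted by u_ij^m.
        new_centers = []
        for j in range(len(centers)):
            w = [u[j] ** m for u in memberships]
            total = sum(w)
            new_centers.append(
                [sum(wi * p[a] for wi, p in zip(w, points)) / total
                 for a in range(dim)]
            )
        moved = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if moved < tol:
            break
    return centers, memberships


# Hypothetical 2-D feature vectors for six profiles (made-up values).
profiles = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2],
            [5.0, 5.1], [5.2, 4.9], [4.9, 5.0]]
centers, memberships = fuzzy_c_means(profiles)
# Harden the soft assignment: each profile goes to its highest-membership cluster.
labels = [row.index(max(row)) for row in memberships]
```

Unlike K-means, each profile receives a degree of membership in every cluster, and the memberships of a profile sum to 1. A hard label is obtained only at the end by taking the cluster with the largest membership.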
