Journal: Neural Networks: The Official Journal of the International Neural Network Society

A multi-label, semi-supervised classification approach applied to personality prediction in social media

Abstract

Social media allow web users to create and share content on many subjects, exposing their activities, opinions, feelings and thoughts. As a result, online social media have attracted the interest of data scientists seeking to understand behaviours and trends while collecting statistics for social sites. One potential application of these data is personality prediction, which aims to understand a user's behaviour within social media. Traditional personality prediction relies on users' profiles, their status updates, the messages they post, etc. Here, a personality prediction system for social media data is introduced that differs from most approaches in the literature in that it works with groups of texts, instead of single texts, and does not take users' profiles into account. The proposed approach also extracts meta-attributes from texts and does not work directly with the content of the messages. The set of possible personality traits is taken from the Big Five model, which allows the problem to be characterised as a multi-label classification task. The problem is then transformed into a set of five binary classification problems, which are solved with a semi-supervised learning approach because of the difficulty of annotating the massive amounts of data generated in social media. In our implementation, the proposed system was trained with three well-known machine-learning algorithms, namely a Naive Bayes classifier, a Support Vector Machine, and a Multilayer Perceptron neural network. The system was applied to predict personality from Tweets taken from three datasets available in the literature and achieved approximately 83% prediction accuracy, with some personality traits presenting better individual classification rates than others.
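
The decomposition described in the abstract, one binary classifier per Big Five trait trained with the help of unlabelled data, can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not name the semi-supervised algorithm, so plain self-training is assumed here; a nearest-centroid classifier stands in for the Naive Bayes, SVM and Multilayer Perceptron models to keep the sketch dependency-free; and the two-dimensional "meta-attribute" vectors, the margin threshold and all function names are hypothetical.

```python
from math import dist

TRAITS = ["openness", "conscientiousness", "extraversion",
          "agreeableness", "neuroticism"]

def centroid(rows):
    """Component-wise mean of a non-empty list of feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]

class CentroidClassifier:
    """Stand-in binary classifier (assumption: NOT one of the paper's models)."""

    def fit(self, X, y):
        # Assumes both classes (0 and 1) occur in y.
        self.pos = centroid([x for x, t in zip(X, y) if t == 1])
        self.neg = centroid([x for x, t in zip(X, y) if t == 0])
        return self

    def predict_with_margin(self, x):
        """Return (label, confidence), confidence = gap between the distances."""
        d_pos, d_neg = dist(x, self.pos), dist(x, self.neg)
        return (1 if d_pos < d_neg else 0), abs(d_pos - d_neg)

def self_train(X_lab, y_lab, X_unlab, margin=1.0, rounds=5):
    """Self-training: repeatedly absorb confidently pseudo-labelled samples."""
    X, y, pool = list(X_lab), list(y_lab), list(X_unlab)
    clf = CentroidClassifier().fit(X, y)
    for _ in range(rounds):
        confident, rest = [], []
        for x in pool:
            label, conf = clf.predict_with_margin(x)
            (confident if conf >= margin else rest).append((x, label))
        if not confident:
            break  # nothing new to learn from
        for x, label in confident:
            X.append(x)
            y.append(label)
        pool = [x for x, _ in rest]
        clf = CentroidClassifier().fit(X, y)
    return clf

def fit_binary_relevance(X_lab, Y_lab, X_unlab):
    """Train one independent binary model per trait (five binary problems)."""
    return {trait: self_train(X_lab, [row[j] for row in Y_lab], X_unlab)
            for j, trait in enumerate(TRAITS)}

def predict_traits(models, x):
    """Multi-label prediction: a 0/1 decision for every trait."""
    return {t: m.predict_with_margin(x)[0] for t, m in models.items()}

# Toy data (hypothetical): feature 0 drives the first three traits,
# feature 1 drives the last two.
X_lab = [[0.0, 0.0], [0.0, 5.0], [5.0, 0.0], [5.0, 5.0]]
Y_lab = [[0, 0, 0, 0, 0], [0, 0, 0, 1, 1], [1, 1, 1, 0, 0], [1, 1, 1, 1, 1]]
X_unlab = [[4.5, 0.5], [0.5, 4.5], [4.0, 4.0], [2.5, 2.5]]

models = fit_binary_relevance(X_lab, Y_lab, X_unlab)
pred = predict_traits(models, [4.8, 0.3])
print(pred)
```

Training each trait independently in this way is commonly called binary relevance. It keeps the five problems simple but ignores any correlation between traits, which may be one reason individual classification rates can differ from trait to trait.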
