Journal: Neural Networks: The Official Journal of the International Neural Network Society

A multi-label, semi-supervised classification approach applied to personality prediction in social media



Abstract

Social media allow web users to create and share content pertaining to different subjects, exposing their activities, opinions, feelings and thoughts. In this context, online social media have attracted the interest of data scientists seeking to understand behaviours and trends, whilst collecting statistics for social sites. One potential application for these data is personality prediction, which aims to understand a user's behaviour within social media. Traditional personality prediction relies on users' profiles, their status updates, the messages they post, etc. Here, a personality prediction system for social media data is introduced that differs from most approaches in the literature, in that it works with groups of texts, instead of single texts, and does not take users' profiles into account. Also, the proposed approach extracts meta-attributes from texts and does not work directly with the content of the messages. The set of possible personality traits is taken from the Big Five model and allows the problem to be characterised as a multi-label classification task. The problem is then transformed into a set of five binary classification problems and solved by means of a semi-supervised learning approach, due to the difficulty in annotating the massive amounts of data generated in social media. In our implementation, the proposed system was trained with three well-known machine-learning algorithms, namely a Naive Bayes classifier, a Support Vector Machine, and a Multilayer Perceptron neural network. The system was applied to predict personality traits from Tweets taken from three datasets available in the literature, and achieved approximately 83% prediction accuracy, with some of the personality traits presenting better individual classification rates than others.
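
The decomposition described in the abstract (one multi-label task split into five binary tasks, each trained with the help of unlabelled data) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration rather than the authors' implementation: the abstract does not say which semi-supervised scheme was used, so plain self-training stands in for it, and a toy nearest-centroid classifier stands in for the Naive Bayes, SVM, and Multilayer Perceptron models. The names `TRAITS`, `CentroidClassifier`, `self_train`, `fit_binary_relevance`, and `predict_traits` are hypothetical, and the feature vectors are placeholders for the meta-attributes extracted from groups of texts.

```python
from typing import Dict, List, Sequence, Tuple

# Big Five traits; one binary classifier is trained per trait.
TRAITS = ["openness", "conscientiousness", "extraversion",
          "agreeableness", "neuroticism"]

Vector = Sequence[float]


def centroid(rows: List[Vector]) -> List[float]:
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def sq_dist(a: Vector, b: Vector) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


class CentroidClassifier:
    """Toy binary classifier (an assumption, not the paper's model).

    Requires at least one example of each class in the training data.
    """

    def fit(self, X: List[Vector], y: List[int]) -> "CentroidClassifier":
        self.c0 = centroid([x for x, t in zip(X, y) if t == 0])
        self.c1 = centroid([x for x, t in zip(X, y) if t == 1])
        return self

    def predict_with_margin(self, x: Vector) -> Tuple[int, float]:
        d0, d1 = sq_dist(x, self.c0), sq_dist(x, self.c1)
        label = 1 if d1 < d0 else 0
        # Relative distance gap in [0, 1]; used as a crude confidence.
        return label, abs(d0 - d1) / (d0 + d1 + 1e-12)


def self_train(X_lab: List[Vector], y_lab: List[int], X_unlab: List[Vector],
               threshold: float = 0.5, max_rounds: int = 10) -> CentroidClassifier:
    """Self-training: pseudo-label confident unlabelled points, then refit."""
    X, y, pool = list(X_lab), list(y_lab), list(X_unlab)
    clf = CentroidClassifier().fit(X, y)
    for _ in range(max_rounds):
        if not pool:
            break
        confident, remaining = [], []
        for x in pool:
            label, margin = clf.predict_with_margin(x)
            (confident if margin >= threshold else remaining).append((x, label))
        if not confident:
            break  # nothing new to learn from
        for x, label in confident:
            X.append(x)
            y.append(label)
        pool = [x for x, _ in remaining]
        clf = CentroidClassifier().fit(X, y)
    return clf


def fit_binary_relevance(X_lab: List[Vector], Y_lab: List[List[int]],
                         X_unlab: List[Vector]) -> Dict[str, CentroidClassifier]:
    """Split the multi-label task into one binary task per trait."""
    return {
        trait: self_train(X_lab, [row[j] for row in Y_lab], X_unlab)
        for j, trait in enumerate(TRAITS)
    }


def predict_traits(models: Dict[str, CentroidClassifier], x: Vector) -> Dict[str, int]:
    return {trait: m.predict_with_margin(x)[0] for trait, m in models.items()}
```

Here `Y_lab` holds one row of five 0/1 labels per labelled example, in the order given by `TRAITS`. Training the five classifiers independently keeps the sketch simple, at the cost of ignoring any correlation between traits.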
