Journal: Computers in Human Behavior

Social media research: The application of supervised machine learning in organizational communication research


Abstract

Despite the online availability of data, analysis of this information in academic research is arduous. This article explores the application of supervised machine learning (SML) to overcome challenges associated with online data analysis. In SML, classifiers are used to categorize and code binary data. Based on a case study of Dutch employees' work-related tweets, this paper compares the coding performance of three classifiers: Linear Support Vector Machine, Naive Bayes, and logistic regression. The performance of these classifiers is assessed by examining accuracy, precision, recall, the area under the precision-recall curve, and Krippendorff's Alpha. These indices are obtained by comparing the coding decisions of the classifier to manual coding decisions. The findings indicate that the Linear Support Vector Machine and Naive Bayes classifiers outperform the logistic regression classifier. This study also compared the performance of these classifiers based on stratified random samples and random samples of training data. The findings indicate that in smaller training sets stratified random training samples perform better than random training samples, whereas in large training sets (n = 4000) random samples yield better results. Finally, the Linear Support Vector Machine classifier was trained with 4000 tweets and subsequently used to categorize 578,581 tweets obtained from 430 employees. © 2016 Elsevier Ltd. All rights reserved.
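
To make the evaluation indices named in the abstract concrete, the following is a minimal sketch of how accuracy, precision, recall, and Krippendorff's Alpha could be computed when a classifier's binary coding decisions are compared with manual coding decisions. It is not the authors' code. The function names, the 0/1 label encoding, and the treatment of the classifier and the human coder as two coders with no missing values are assumptions made for illustration. The area under the precision-recall curve is left out because it requires classifier scores rather than hard decisions.

```python
from collections import Counter


def confusion_counts(manual, predicted, positive=1):
    """Count true/false positives and negatives for binary codes.

    `manual` holds the human coding decisions and `predicted` the
    classifier's decisions (hypothetical names, assumed 0/1 labels).
    """
    tp = fp = fn = tn = 0
    for m, p in zip(manual, predicted):
        if p == positive and m == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif m == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def accuracy_precision_recall(manual, predicted, positive=1):
    """Return (accuracy, precision, recall) against the manual codes."""
    tp, fp, fn, tn = confusion_counts(manual, predicted, positive)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    # Fall back to 0.0 when a denominator is empty (illustrative choice).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


def krippendorff_alpha_nominal(coder_a, coder_b):
    """Krippendorff's Alpha for nominal data, two coders, no missing values.

    Uses alpha = 1 - D_o / D_e, computed from the coincidence matrix:
    each unit contributes two pairable values, so n = 2 * (number of units).
    """
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must code the same non-empty set of units")
    n = 2 * len(coder_a)
    # Each disagreeing unit adds one entry to each of two off-diagonal cells.
    observed_off_diagonal = 2 * sum(a != b for a, b in zip(coder_a, coder_b))
    value_counts = Counter(coder_a) + Counter(coder_b)
    # Sum over c != k of n_c * n_k equals n^2 minus the sum of squared counts.
    expected_off_diagonal = n * n - sum(c * c for c in value_counts.values())
    if expected_off_diagonal == 0:
        raise ValueError("alpha is undefined when only one category occurs")
    return 1.0 - (n - 1) * observed_off_diagonal / expected_off_diagonal


# Toy data, invented for illustration only: 1 = work-related, 0 = not.
manual_codes = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
classifier_codes = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

acc, prec, rec = accuracy_precision_recall(manual_codes, classifier_codes)
alpha = krippendorff_alpha_nominal(manual_codes, classifier_codes)
print(f"accuracy={acc:.3f} precision={prec:.3f} recall={rec:.3f} alpha={alpha:.3f}")
```

Accuracy alone can look favorable when one category dominates, which is one reason a chance-corrected index such as Krippendorff's Alpha is useful alongside it in a comparison of this kind.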
