Journal of Business and Psychology (sponsored by the Business Psychology Research Institute)

A Review of Best Practice Recommendations for Text Analysis in R (and a User-Friendly App)

Abstract

In recent decades, the amount of text available for organizational science research has grown tremendously. Despite the availability of text and advances in text analysis methods, many of these techniques remain largely segmented by discipline. Moreover, there is an increasing number of open-source tools (R, Python) for text analysis, yet these tools are not easily taken advantage of by social science researchers who likely have limited programming knowledge and exposure to computational methods. In this article, we compare quantitative and qualitative text analysis methods used across social sciences. We describe basic terminology and the overlooked, but critically important, steps in pre-processing raw text (e.g., selection of stop words; stemming). Next, we provide an exploratory analysis of open-ended responses from a prototypical survey dataset using topic modeling with R. We provide a list of best practice recommendations for text analysis focused on (1) hypothesis and question formation, (2) design and data collection, (3) data pre-processing, and (4) topic modeling. We also discuss the creation of scale scores for more traditional correlation and regression analyses. All the data are available in an online repository for the interested reader to practice with, along with a reference list for additional reading, an R markdown file, and an open-source interactive topic model tool (topicApp; see https://github.com/wesslen/topicApp, https://github.com/wesslen/text-analysis-org-science, https://dataverse.unc.edu/dataset.xhtml?persistentId=doi:10.15139/S3/R4W7ZS).
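
To make the pre-processing steps named in the abstract concrete, the following is a minimal sketch in Python (one of the two open-source tools the abstract mentions; the article itself works in R). It is not the authors' pipeline: the stop word list, the suffix-stripping rule, and the two example survey responses are all hypothetical and chosen only for illustration. A real analysis would likely use a curated stop word list and an established stemmer.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop word list (an assumption for this sketch).
# The choice of stop words is itself an analytic decision.
STOP_WORDS = {"the", "a", "an", "and", "is", "are", "of", "to", "in", "my", "i", "with"}


def naive_stem(token):
    """Strip a few common English suffixes.

    A crude stand-in for a real stemming algorithm, used here only
    to show what stemming does to a token.
    """
    for suffix in ("ing", "ed", "s"):
        # Keep at least three characters so short words are left alone.
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Lowercase, tokenize on letters, drop stop words, then stem."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [naive_stem(t) for t in tokens if t not in STOP_WORDS]


# Hypothetical open-ended survey responses.
responses = [
    "I enjoy working with my managers",
    "The manager supported flexible working",
]

# Term counts across all responses: the raw material for a document-term matrix.
term_counts = Counter(tok for r in responses for tok in preprocess(r))
```

After stemming, "managers" and "manager" collapse to one term, as do both occurrences of "working". This is the kind of vocabulary reduction that affects what a downstream topic model can detect.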