Journal: Quality & Quantity

Thematic content analysis using supervised machine learning: An empirical evaluation using German online news


Abstract

In recent years, two approaches to automatic content analysis have been introduced in the social sciences: semantic network analysis and supervised text classification. We argue that, although less linguistically sophisticated than semantic parsing techniques, statistical machine learning offers many advantages for applied communication research. By using manually coded material for training, supervised classification seamlessly bridges the gap between traditional and automatic content analysis. In this paper, we briefly introduce the conceptual foundations of machine learning approaches to text classification and discuss their application in social science research. We then evaluate their potential in an experimental study in which German online news was coded with established thematic categories. Moreover, we investigate whether and how linguistic preprocessing can improve classification quality. Results indicate that supervised text classification is generally robust and reliable for some categories, but may even be useful when it fails.
