Stop Words Are Not 'Nothing': German Modal Particles and Public Engagement in Social Media


Abstract

Social media research often exploits metrics based on frequency counts, e.g., to determine corpus sentiment. Hampton and Shalin [1] introduced an alternative metric that examines the style and structure of social media relative to an Internet language baseline. They demonstrated statistically significant differences in lexical choice between tweets collected in a disaster setting and that baseline. One explanation of this finding is that the Twitter platform itself, irrespective of the disaster setting, and/or specifics of the English language are responsible for the observed differences. In this paper, we apply the same metric to German corpora, comparing an event-based crawl (the 2017 German election) with a "nothing" crawl with respect to the use of German modal particles. German modal particles are often used in spoken language and are typically regarded as stop words in text mining. This word class is likely to reflect public engagement because of its properties, such as indicating common ground or referring to previous utterances (i.e., anaphora) [2,3]. We demonstrate a positive deviation of most modal particles for all corpora relative to general Internet language, consistent with the view that Twitter constitutes a form of conversation. However, the use of modal particles also generally increased in the three corpora related to the 2017 German election relative to the "nothing" corpus. This indicates topic influence beyond platform affordances and supports an interpretation of the German election data as an engaged, collective narrative response to events. Because it relies on commonly eliminated features, our finding supports and extends Hampton and Shalin's analysis, which relied on pre-selected antonyms, and suggests an alternative to frequency counts for identifying corpora that differ in public engagement.
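The abstract does not spell out how the metric is computed, so the following is only a minimal sketch of the general idea of measuring deviation from a baseline. It assumes a simple log-ratio of relative frequencies, which may differ from the metric of Hampton and Shalin [1]. The particle list, the function names, and the toy sentences are all hypothetical and chosen purely for illustration.

```python
import math
import re
from collections import Counter

# Illustrative subset of German modal particles (an assumption, not the paper's list).
MODAL_PARTICLES = ("ja", "doch", "halt", "eben", "mal", "wohl", "schon")


def relative_frequencies(texts, vocabulary):
    """Return occurrences per token for each word in `vocabulary`."""
    tokens = [tok for text in texts for tok in re.findall(r"\w+", text.lower())]
    counts = Counter(tokens)
    total = len(tokens)
    return {w: (counts[w] / total if total else 0.0) for w in vocabulary}


def log_ratio(corpus_freq, baseline_freq, eps=1e-9):
    """Log-ratio of corpus to baseline frequency; positive means over-represented.

    `eps` is a smoothing constant (an assumption) that avoids division by zero.
    """
    return {
        w: math.log((corpus_freq[w] + eps) / (baseline_freq[w] + eps))
        for w in corpus_freq
    }


# Made-up toy sentences, not data from the study.
tweets = ["Das ist ja wohl ein Witz", "Geh doch mal wählen"]
baseline = ["Die Wahl findet am Sonntag statt", "Das Ergebnis ist ja bekannt"]

tweet_freq = relative_frequencies(tweets, MODAL_PARTICLES)
baseline_freq = relative_frequencies(baseline, MODAL_PARTICLES)
deviation = log_ratio(tweet_freq, baseline_freq)

for particle, value in deviation.items():
    print(f"{particle}: {value:+.3f}")
```

Plain token matching cannot tell a modal particle from its homonyms (for example, "ja" as the answer "yes"), so a real analysis would likely need part-of-speech tagging or manual disambiguation before computing any such ratio.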