Understanding US regional linguistic variation with Twitter data analysis

Huang Yuan; Guo Diansheng; Kasakoff Alice; Grieve Jack

Abstract

We analyze a large dataset of geo-tagged tweets covering one year (Oct. 2013-Oct. 2014) to understand regional linguistic variation in the U.S. Prior work on regional linguistic variation usually took a long time to collect data and focused on either rural or urban areas. Geo-tagged Twitter data offers an unprecedented database with rich linguistic representation at fine spatiotemporal resolution and continuity. From the one-year Twitter corpus, we extract lexical characteristics for Twitter users by summarizing the frequencies of a set of lexical alternations that each user has used. We spatially aggregate and smooth each lexical characteristic to derive county-based linguistic variables, from which orthogonal dimensions are extracted using principal component analysis (PCA). Finally, a regionalization method is used to discover hierarchical dialect regions from the PCA components. The regionalization results reveal interesting regional linguistic variation in the U.S. The discovered regions not only confirm past research findings in the literature but also provide new insights and a more detailed understanding of very recent linguistic patterns in the U.S. (C) 2015 Elsevier Ltd. All rights reserved.
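The pipeline described in the abstract (alternation frequencies, spatial smoothing, PCA, regionalization) can be illustrated with a minimal sketch on toy data. Everything specific here is an assumption made for illustration and is not taken from the paper: the six "counties" arranged in a chain, the invented counts, the neighbor-average smoother, and the simple contiguity-constrained agglomerative merge used as a stand-in for the regionalization method, which the abstract does not name. The function names `neighbor_smooth`, `pca_scores`, and `regionalize` are hypothetical.

```python
import numpy as np

def neighbor_smooth(values, adjacency):
    """Average each county's values with those of its neighbors and itself (assumed smoother)."""
    weights = adjacency + np.eye(len(adjacency))
    return (weights @ values) / weights.sum(axis=1, keepdims=True)

def pca_scores(values, n_components):
    """Project standardized variables onto their leading principal components via SVD."""
    z = (values - values.mean(axis=0)) / values.std(axis=0)
    u, s, _ = np.linalg.svd(z, full_matrices=False)
    return u[:, :n_components] * s[:n_components]

def regionalize(scores, adjacency, n_regions):
    """Repeatedly merge the two adjacent regions whose mean PCA scores are closest.

    A simple stand-in for a regionalization method: only spatially
    contiguous regions may merge, and the merge order forms a hierarchy.
    """
    regions = {i: [i] for i in range(len(scores))}
    while len(regions) > n_regions:
        best = None
        for a in regions:
            for b in regions:
                if a < b and adjacency[np.ix_(regions[a], regions[b])].any():
                    gap = np.linalg.norm(scores[regions[a]].mean(axis=0)
                                         - scores[regions[b]].mean(axis=0))
                    if best is None or gap < best[0]:
                        best = (gap, a, b)
        if best is None:  # no adjacent pair left to merge
            break
        _, a, b = best
        regions[a].extend(regions.pop(b))
    labels = np.empty(len(scores), dtype=int)
    for k, members in enumerate(regions.values()):
        labels[members] = k
    return labels

# Toy data (invented): per-county counts of variant A vs. variant B
# for two lexical alternations, six counties in a row.
variant_a = np.array([[90, 80], [85, 82], [88, 79],
                      [20, 10], [15, 12], [22, 9]], dtype=float)
variant_b = 100.0 - variant_a
proportions = variant_a / (variant_a + variant_b)  # share of variant A

# Chain adjacency: county i borders county i + 1.
adjacency = np.zeros((6, 6), dtype=int)
for i in range(5):
    adjacency[i, i + 1] = adjacency[i + 1, i] = 1

smoothed = neighbor_smooth(proportions, adjacency)
scores = pca_scores(smoothed, n_components=1)
labels = regionalize(scores, adjacency, n_regions=2)
print(labels)
```

Stopping the merge loop at different values of `n_regions` yields nested partitions, which is one simple way to read "hierarchical" regions; a real analysis would use actual county boundaries and the authors' own smoothing and regionalization procedures.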

Bibliographic details

Source
Computers, Environment and Urban Systems | 2016, Issue 9 | pp. 244-255 | 12 pages
Authors
Huang Yuan; Guo Diansheng; Kasakoff Alice; Grieve Jack
Author affiliations

Univ South Carolina, Dept Geog, 709 Bull St,Room 127, Columbia, SC 29208 USA;

Univ South Carolina, Dept Geog, 709 Bull St,Room 127, Columbia, SC 29208 USA;

Univ South Carolina, Dept Geog, 709 Bull St,Room 127, Columbia, SC 29208 USA;

Aston Univ, Sch Languages & Social Sci, Birmingham B4 7ET, W Midlands, England;

Original format: PDF
Language: English
Keywords
Social media; Linguistic; Twitter; American dialects; Regionalization; US regions; Spatial data mining;
