Using Google Books Ngram in Detecting Linguistic Shifts over Time

Abstract

The availability of large historical corpora, such as Google Books Ngram, makes it possible to extract various meta-information about the evolution of human languages. Together with advances in machine learning techniques, researchers have recently used these huge corpora to track cultural and linguistic shifts in words and terms over time. In this paper, we develop a new approach to quantitatively recognize semantic changes of words during the period between 1800 and 1990. We use the state-of-the-art FastText approach to construct word embeddings for the Google Books Ngram corpus for each decade within the period 1800-1990. We use time series analysis to identify words that show a statistically significant change in the period between 1900 and 1990. We conduct a performance evaluation study to compare our approach against related work and show that our system is more robust to morphological language variations.
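The abstract does not say which time series test is used, so the following is only a minimal sketch of the general idea, under stated assumptions: each word has one embedding vector per decade (in practice these would come from FastText models trained per decade and aligned to a common space), the series is the cosine distance of each decade's vector from the first decade, and significance is estimated with a permutation test on the largest mean shift. The function names and the toy two-dimensional vectors are hypothetical and are not taken from the paper.

```python
import math
import random


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def distance_series(vectors):
    """Distance of each decade's vector from the first decade's vector."""
    base = vectors[0]
    return [cosine_distance(base, v) for v in vectors]


def largest_mean_shift(series):
    """Return (shift, split) for the split index with the largest
    absolute difference between the mean after and the mean before."""
    best_shift, best_split = 0.0, 1
    for k in range(1, len(series)):
        before, after = series[:k], series[k:]
        shift = abs(sum(after) / len(after) - sum(before) / len(before))
        if shift > best_shift:
            best_shift, best_split = shift, k
    return best_shift, best_split


def detect_shift(series, n_perm=1000, seed=0):
    """Permutation test (an assumed choice of test): how often does a
    shuffled series show a mean shift at least as large as the observed one?"""
    rng = random.Random(seed)
    observed, split = largest_mean_shift(series)
    shuffled = list(series)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if largest_mean_shift(shuffled)[0] >= observed:
            hits += 1
    p_value = (hits + 1) / (n_perm + 1)
    return split, p_value


# Toy data (hypothetical): a word whose vector changes abruptly in 1950.
decades = list(range(1900, 2000, 10))          # 1900, 1910, ..., 1990
toy_vectors = [[1.0, 0.0]] * 5 + [[0.0, 1.0]] * 5
series = distance_series(toy_vectors)
split, p_value = detect_shift(series)
print("candidate change decade:", decades[split], "p-value:", p_value)
```

A word would be flagged as shifted when the p-value falls below a chosen threshold such as 0.05. A series with no change yields a p-value of 1.0 under this test.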

Bibliographic details

Source
International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management | 2018 | 1 (CD-ROM) | 8 pages
Authors
Alaa El-Ebshihy; Nagwa El-Makky; Khaled Nagi;
Original format: PDF
Chinese Library Classification: G354-53
Keywords
Linguistic shift; Semantic change; Google Books Ngram; FastText; Time series analysis; Computational linguistics
