Identifying Temporal Trends Based on Perplexity and Clustering: Are We Looking at Language Change?


Abstract

In this work we propose a data-driven methodology for identifying temporal trends in a corpus of medieval charters. We have used perplexities derived from RNNs as a distance measure between documents and then performed clustering on those distances. We argue that perplexities calculated by such language models are representative of temporal trends. The clusters produced using the K-Means algorithm give insight into the differences in language across time periods, which are at least partly due to language change. We suggest that the temporal distribution of the individual clusters might provide a more nuanced picture of temporal trends than discrete bins, thus providing better results when used in a classification task.
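The pipeline the abstract describes (perplexities from language models used as distances between documents, followed by K-Means) can be illustrated with a minimal sketch. This is not the authors' implementation. To stay dependency-free it swaps the RNN for an add-one-smoothed character bigram model, trains one model per document, and uses each document's row of cross-perplexities as its feature vector. The toy documents, all function names, and the choice of k = 2 are assumptions made for illustration only.

```python
import math
import random
from collections import Counter

def train_char_bigram(text):
    """Count character bigrams and their left contexts (stand-in for an RNN LM)."""
    pairs = Counter(zip(text, text[1:]))
    contexts = Counter(text[:-1])
    return pairs, contexts

def perplexity(model, text, vocab_size):
    """Perplexity of `text` under an add-one-smoothed bigram model."""
    pairs, contexts = model
    log_prob, n = 0.0, 0
    for a, b in zip(text, text[1:]):
        p = (pairs[(a, b)] + 1) / (contexts[a] + vocab_size)
        log_prob += math.log(p)
        n += 1
    if n == 0:
        raise ValueError("text must contain at least two characters")
    return math.exp(-log_prob / n)

def perplexity_matrix(docs):
    """Entry [i][j] is the perplexity of document i under the model trained on document j."""
    vocab_size = len(set("".join(docs)))
    models = [train_char_bigram(d) for d in docs]
    return [[perplexity(m, d, vocab_size) for m in models] for d in docs]

def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's algorithm with squared Euclidean distance."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid for every point.
        labels = [
            min(range(k), key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Hypothetical toy "charters": two older and two newer spellings.
docs = [
    "the kynge gyveth the londe to the chirche",
    "the kynge gyveth the londe to hys sone",
    "the king gives the land to the church",
    "the king gives the land to his son",
]

matrix = perplexity_matrix(docs)   # one feature vector per document
labels = kmeans(matrix, k=2)       # cluster label per document
print(labels)
```

With dated documents, the cluster labels could then be tabulated against the known dates to inspect how each cluster is distributed over time, which is the kind of comparison with discrete time bins that the abstract suggests.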


Bibliographic record

Source
International workshop on computational approaches to historical language change; Annual meeting of the Association for Computational Linguistics | 2019 | pp. 86-91 | 6 pages
Conference location: Florence (IT)
Authors
Sidsel Boldsen; Manex Agirrezabal; Patrizia Paggio;
Author affiliations

Centre for Language Technology, University of Copenhagen;

Centre for Language Technology, University of Copenhagen; Institute of Linguistics and Language Technology, University of Malta;

Original format: PDF

Similar literature

1. C-TREND: Temporal Cluster Graphs for Identifying and Visualizing Trends in Multiattribute Transactional Data [J]. Adomavicius Gediminas, Bockstedt Jesse. IEEE Transactions on Knowledge and Data Engineering. 2008, No. 6
2. Classifying technology patents to identify trends: Applying a fuzzy-based clustering approach in the Turkish textile industry [J]. Tuerkay Dereli, Alptekin Durmusoglu. Technology in society. 2009, No. 3
3. Are there temporal subtypes of premenstrual dysphoric disorder?: using group-based trajectory modeling to identify individual differences in symptom change [J]. Quantum electronics. 2020, No. 6
4. Identifying Temporal Trends Based on Perplexity and Clustering: Are We Looking at Language Change? [C]. Sidsel Boldsen, Manex Agirrezabal, Patrizia Paggio. International workshop on computational approaches to historical language change. 2019
5. Identifying temporal trends in treated sagebrush communities using remotely sensed imagery. [D]. Sant, Eric D. 2005
6. Re: Author response regarding the Letter to the Editor about the article ‘Temporal trends in the accuracy of hospital diagnostic coding for identifying acute stroke: A population-based study’. European Stroke Journal 2019 Oct 14. DOI: 10.1177/2396987319881017 [O]. Linxin Li, Peter M Rothwell, on behalf of all authors. 2020
7. Mutual Information and Perplexity Based Clustering of Dialogue Information for Dynamic Adaptation of Language Models [O]. Juan Manuel Lucas-cuesta, O Fernández-martínez, Tirso Moreno, 2013
