Journal: Natural Language Engineering

Measuring diachronic language distance using perplexity: Application to English, Portuguese, and Spanish



Abstract

The objective of this work is to establish a corpus-driven methodology for automatically quantifying the diachronic language distance between chronological periods of several languages. We apply a perplexity-based measure to written texts representing different historical periods of three languages: European English, European Portuguese, and European Spanish. For this purpose, we have built a historical corpus for each period, compiled from different open corpus sources containing texts as close as possible to their original spelling. The results of our experiments show that a diachronic language distance based on perplexity detects the linguistic evolution that had already been described by historians of the three languages. Notably, the method is unsupervised and multilingual, and it only needs raw corpora organized by period.
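To make the idea of a perplexity-based distance concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not state the language model, the n-gram order, the smoothing, or how the two directions are combined. The sketch therefore assumes a character trigram model with add-one smoothing and a symmetric average of the two perplexities. The function names (`train_char_ngrams`, `perplexity`, `perplexity_distance`) are hypothetical.

```python
import math
from collections import Counter


def train_char_ngrams(text, n=3):
    """Count character n-grams and their (n-1)-character contexts.

    Assumption: a character-level model; the abstract does not specify one.
    """
    padded = " " * (n - 1) + text
    ngrams = Counter(padded[i:i + n] for i in range(len(text)))
    contexts = Counter(padded[i:i + n - 1] for i in range(len(text)))
    vocab = set(text)
    return ngrams, contexts, vocab


def perplexity(model, text, n=3):
    """Perplexity of a non-empty `text` under `model`, with add-one smoothing."""
    ngrams, contexts, vocab = model
    v = len(vocab) + 1  # one extra slot shared by all unseen characters
    padded = " " * (n - 1) + text
    log_prob = 0.0
    for i in range(len(text)):
        gram = padded[i:i + n]
        ctx = gram[:-1]
        log_prob += math.log((ngrams[gram] + 1) / (contexts[ctx] + v))
    return math.exp(-log_prob / len(text))


def perplexity_distance(text_a, text_b, n=3):
    """Symmetric distance: mean of the perplexities in both directions.

    Assumption: averaging both directions is an illustrative choice.
    """
    pp_ab = perplexity(train_char_ngrams(text_a, n), text_b, n)
    pp_ba = perplexity(train_char_ngrams(text_b, n), text_a, n)
    return (pp_ab + pp_ba) / 2
```

In use, each argument would be the raw text of one historical period, for example `perplexity_distance(period_1_text, period_2_text)` with hypothetical variable names. A larger value suggests that a model trained on one period predicts the other period less well.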
