Published in: Conference on Empirical Methods in Natural Language Processing

Past, Present, Future: A Computational Investigation of the Typology of Tense in 1000 Languages


Abstract

We present SuperPivot, an analysis method for low-resource languages that occur in a superparallel corpus, i.e., in a corpus that contains an order of magnitude more languages than parallel corpora currently in use. We show that SuperPivot performs well for the crosslingual analysis of the linguistic phenomenon of tense. We produce analysis results for more than 1000 languages, conducting, to the best of our knowledge, the largest crosslingual computational study performed to date. We extend existing methodology for leveraging parallel corpora for typological analysis by overcoming a limiting assumption of earlier work: we only require that a linguistic feature be overtly marked in a few of the thousands of languages, as opposed to requiring that it be marked in all languages under investigation.
