Venue: International Conference on Computational Linguistics

Automatic Discovery of Adposition Typology

Abstract

Natural languages can be classified as prepositional or postpositional based on the relative order of the noun phrase and the adposition. Categorizing a language by its adposition typology helps in addressing several challenges in linguistics and natural language processing (NLP). Determining the adposition typology of less-studied languages by manual analysis of large text corpora can be quite expensive, yet automatic discovery of that typology has received very little attention to date. This research presents a simple unsupervised technique to automatically predict the adposition typology of a language. Most of the function words of a language are adpositions, and we show that function words can be effectively separated from content words by leveraging differences in their distributional properties in a corpus. Using this principle, we show that languages can be classified as prepositional or postpositional based on rank correlations derived from the entropies of word co-occurrence distributions. Our claims are substantiated through experiments on 23 languages from ten diverse families, 19 of which are correctly classified by our technique.
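
The abstract names the ingredients (entropies of word co-occurrence distributions and rank correlations) but not the exact procedure. As a rough illustration only, the sketch below computes, for each word in a tokenized corpus, the entropy of its left-neighbor and right-neighbor distributions, and then a Spearman rank correlation between word frequency and each entropy. The function names (`neighbor_entropies`, `ranks`, `spearman`), the use of immediate neighbors as the co-occurrence window, and the pairing of frequency with entropy are all assumptions made for this example, not the authors' stated method. How the correlations map to a prepositional or postpositional label is not given in the abstract, so the sketch stops at computing them.

```python
import math
from collections import Counter, defaultdict


def neighbor_entropies(tokens):
    """Return {word: (left_entropy, right_entropy)} in bits.

    Assumption: "co-occurrence" is approximated by immediate neighbors.
    """
    left = defaultdict(Counter)   # left[w][v]: v seen directly before w
    right = defaultdict(Counter)  # right[w][v]: v seen directly after w
    for prev, cur in zip(tokens, tokens[1:]):
        right[prev][cur] += 1
        left[cur][prev] += 1

    def entropy(counter):
        total = sum(counter.values())
        if total == 0:
            return 0.0
        return -sum((c / total) * math.log2(c / total)
                    for c in counter.values())

    return {w: (entropy(left[w]), entropy(right[w])) for w in set(tokens)}


def ranks(values):
    """Average ranks (1-based), with ties sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        return 0.0  # correlation is undefined for constant input
    return cov / (sx * sy)


if __name__ == "__main__":
    # Toy English corpus, purely illustrative.
    corpus = ("the cat sat on the mat in the house "
              "the dog slept on the rug in the room").split()
    freq = Counter(corpus)
    ent = neighbor_entropies(corpus)
    words = sorted(freq)
    f = [freq[w] for w in words]
    print("freq vs left entropy :", spearman(f, [ent[w][0] for w in words]))
    print("freq vs right entropy:", spearman(f, [ent[w][1] for w in words]))
```

On a real corpus one would presumably restrict the comparison to the most frequent words, since the abstract treats function words as the carriers of the signal; that cutoff is likewise left open here.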