Journal: Computational Linguistics

Modeling Language Variation and Universals: A Survey on Typological Linguistics for Natural Language Processing


Abstract

Linguistic typology aims to capture structural and semantic variation across the world’s languages. A large-scale typology could provide excellent guidance for multilingual Natural Language Processing (NLP), particularly for languages that lack human-labeled resources. We present an extensive literature survey on the use of typological information in the development of NLP techniques. Our survey demonstrates that, to date, the use of information in existing typological databases has resulted in consistent but modest improvements in system performance. We show that this is due to both intrinsic limitations of the databases (in terms of coverage and feature granularity) and under-utilization of the typological features included in them. We advocate for a new approach that adapts the broad and discrete nature of typological categories to the contextual and continuous nature of machine learning algorithms used in contemporary NLP. In particular, we suggest that such an approach could be facilitated by recent developments in data-driven induction of typological knowledge.
