Building Dialectological Corpora for Turkic Languages: Mishar Dialect of Tatar

Abstract

Corpus-based dialectology of less-resourced and functionally limited native languages is a developing field of linguistics. In this paper we discuss the challenges of annotating dialect corpora for the Turkic languages of Russia, using the Mishar dialect of Tatar as an example. We investigate the peculiarities of grammatical variability in the Mishar dialect from the standpoint of automatic annotation and describe the search functionality of the corpus. The proposed annotation methodology can be applied when creating multilingual integrated resources and parallel corpora of closely related languages.
