Sentence-aligned parallel corpus Amazigh-English

Abstract

Research in Natural Language Processing has shown growing interest in under-resourced languages in recent years. Amazigh is the autochthonous language of North Africa. However, it was not until 2011, after years of persecution, that it became a constitutionally official language in Morocco, and it is still considered one of the under-resourced languages. The question is: “How can the Amazigh language reach the level of advanced languages?” Motivated by these considerations, we describe our effort to develop an Amazigh-English parallel corpus intended for use in linguistic research, teaching, and natural language processing applications, primarily machine translation. To the best of our knowledge, this is the first Amazigh-English parallel corpus. The corpus is sentence-aligned and includes 20,726 sentences. The alignment was done automatically, while the evaluation was done manually. The experimental results are satisfactory, achieving more than 90%.
