Persian in MULTEXT-East Framework

Abstract

Farsi, also known as Persian, is the official language of Iran and Tajikistan and one of the two main languages spoken in Afghanistan. It is an Indo-European agglutinating language written in Arabic script. This paper presents the first step in creating a basic language resources kit for Farsi. This step comprises the specifications for morphosyntactic encoding, which are based on the EAGLES/MULTEXT model and the specific resources of MULTEXT-East. The paper introduces the language, with an emphasis on its writing system and morphological properties, and then its morphosyntactic specifications. Two other important issues are introduced: a novel part-of-speech (PoS) categorization and a unified orthography for Farsi in the digital environment. A lexicon and an annotated corpus are under preparation.