
English to Persian Transliteration


Abstract

Persian is an Indo-European language written using Arabic script, and is an official language of Iran, Afghanistan, and Tajikistan. Transliteration of Persian to English (that is, the character-by-character mapping of a Persian word that is not readily available in a bilingual dictionary) is an unstudied problem. In this paper we make three novel contributions. First, we present performance comparisons of existing grapheme-based transliteration methods on English to Persian. Second, we discuss the difficulties in establishing a corpus for studying transliteration. Finally, we introduce a new model of Persian that takes into account the habit of shortening, or even omitting, runs of English vowels. This trait makes transliteration of Persian particularly difficult for phonetic-based methods. The new model outperforms the existing grapheme-based methods on Persian, exhibiting a 24% relative increase in transliteration accuracy measured using the top-5 criterion.
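The abstract reports accuracy under a top-5 criterion and a relative (not absolute) increase. The following is a minimal sketch of how such figures are commonly computed, assuming that a word counts as correct when its reference transliteration appears among the five highest-ranked candidates; the paper's exact definition is not given here. The function names, the toy data, and the accuracy values 0.50 and 0.62 are hypothetical and chosen only to illustrate the arithmetic.

```python
def top_k_accuracy(candidates, references, k=5):
    """Fraction of source words whose reference is among the first k candidates.

    candidates: dict mapping a source word to its ranked list of outputs.
    references: dict mapping a source word to its reference transliteration.
    """
    hits = sum(
        1
        for word, ref in references.items()
        if ref in candidates.get(word, [])[:k]
    )
    return hits / len(references)


def relative_increase(baseline, improved):
    """Relative change of `improved` over `baseline` (0.24 means 24%)."""
    return (improved - baseline) / baseline


# Hypothetical toy data: placeholder strings stand in for real transliterations.
references = {"w1": "a", "w2": "z"}
candidates = {"w1": ["b", "a", "c"], "w2": ["x", "y"]}

accuracy = top_k_accuracy(candidates, references, k=5)

# Hypothetical accuracies: moving from 0.50 to 0.62 is a 24% relative increase,
# although the absolute gain is only 12 percentage points.
gain = relative_increase(0.50, 0.62)
```

The distinction matters when reading the result: a 24% relative increase says nothing about the absolute accuracy of either system without the baseline figure.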
