Processing Judeo-Arabic Texts


Abstract

Judeo-Arabic is a set of dialects spoken and written by Jewish communities living in Arab countries. Judeo-Arabic is typically written in Hebrew letters, enriched with diacritic marks that relate to the underlying Arabic. However, some inconsistencies in rendering words in Hebrew letters increase the level of ambiguity of a given word. Furthermore, Judeo-Arabic texts usually contain non-Arabic words and phrases, such as quotations or borrowed words from Hebrew and Aramaic. We focus on two main tasks: (1) automatic transliteration of Judeo-Arabic Hebrew letters into Arabic letters, and (2) automatic identification of language switching points between Judeo-Arabic and Hebrew. For transliteration, we employ a statistical translation system trained on the character level, resulting in 96.9% precision, a significant improvement over the baseline. For the language switching task, we use a word-level supervised classifier, also showing some significant improvements over the baseline.
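
The abstract does not describe how the character-level training data is prepared. One common way to run a token-based statistical translation toolkit on the character level is to rewrite each line as space-separated characters with an explicit word-boundary symbol, so that every character becomes a "word" for the toolkit. The sketch below illustrates only that preprocessing step under this assumption; the `BOUNDARY` marker and both function names are hypothetical and are not taken from the paper.

```python
# Hypothetical word-boundary marker (an assumption, not from the paper);
# any symbol that never occurs in the corpus would work.
BOUNDARY = "_"


def to_char_tokens(line: str) -> str:
    """Rewrite a line as space-separated characters.

    Words are separated by BOUNDARY so the original spacing can be
    restored later. Combining diacritic marks become separate tokens.
    """
    words = line.split()
    return f" {BOUNDARY} ".join(" ".join(word) for word in words)


def from_char_tokens(tokens: str) -> str:
    """Invert to_char_tokens on a character-tokenized line."""
    return "".join(" " if t == BOUNDARY else t for t in tokens.split())


if __name__ == "__main__":
    source = "אל כתאב"  # Judeo-Arabic in Hebrew letters
    print(to_char_tokens(source))
    print(from_char_tokens(to_char_tokens(source)))
```

The same transformation would be applied to the Arabic side of a parallel corpus before training, and `from_char_tokens` would be applied to the system's output to recover ordinary words.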