Journal: Procedia Computer Science

A Sequence-to-Sequence based Approach For the double Transliteration of Tunisian Dialect

Abstract

Transliteration is the automatic conversion of text from one writing system to another while preserving its pronunciation. It is typically used in machine translation and cross-language information retrieval, mainly to handle named entities and technical terms. Some Arabic dialects are written on the social web in both Latin and Arabic scripts and remain low-resource languages. For these dialects, transliteration is of great benefit for automatically generating the linguistic resources (parallel corpora and lexica) needed to process them. In this work, we focus on transliteration of the Tunisian dialect. We propose a deep learning based Sequence-to-Sequence approach that performs word-level transliteration of user-generated Tunisian dialect text from the social web in both directions: Latin to Arabic and Arabic to Latin.
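As a rough illustration of what word-level transliteration implies for data preparation, the sketch below encodes word pairs as padded character-id sequences, the usual input form for a Sequence-to-Sequence model. It is not the authors' pipeline: the special tokens, the helper names `build_vocab` and `encode`, the maximum length, and the two toy word pairs are all assumptions made for this example, and the pairs are not taken from the paper's corpus.

```python
# Minimal sketch (assumed, not the paper's implementation): character-level
# encoding of word pairs for a word-level transliteration model.

PAD, SOS, EOS = "<pad>", "<sos>", "<eos>"  # hypothetical special tokens


def build_vocab(words):
    """Map the special tokens and every character seen in `words` to an id."""
    chars = sorted({ch for word in words for ch in word})
    symbols = [PAD, SOS, EOS] + chars
    return {symbol: index for index, symbol in enumerate(symbols)}


def encode(word, vocab, max_len):
    """Wrap a word in start/end markers and pad it to `max_len` ids."""
    ids = [vocab[SOS]] + [vocab[ch] for ch in word] + [vocab[EOS]]
    if len(ids) > max_len:
        raise ValueError(f"word too long for max_len={max_len}: {word!r}")
    return ids + [vocab[PAD]] * (max_len - len(ids))


# Illustrative (Latin script, Arabic script) pairs, chosen for this example.
pairs = [("3asslema", "عسلامة"), ("barcha", "برشا")]

latin_vocab = build_vocab(latin for latin, _ in pairs)
arabic_vocab = build_vocab(arabic for _, arabic in pairs)

MAX_LEN = 10  # assumed upper bound on word length plus the two markers

# Latin-to-Arabic direction: Latin characters are the source sequence.
latin_to_arabic = [
    (encode(latin, latin_vocab, MAX_LEN), encode(arabic, arabic_vocab, MAX_LEN))
    for latin, arabic in pairs
]

# Arabic-to-Latin direction: the same pairs with source and target swapped.
arabic_to_latin = [(target, source) for source, target in latin_to_arabic]
```

Because the model works on single words, each training example is short, and the same paired data can serve both directions simply by swapping source and target.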
