Conference: International Conference on Mining Intelligence and Knowledge Exploration

A Joint Source Channel Model for the English to Bengali Back Transliteration

Abstract

In this paper we present an English-to-Bengali back-transliteration system that converts Bengali text written in Romanized (English) script back to its original script. The system uses a bilingual parallel corpus of English-Bengali transliterated word pairs and applies both orthographic and phonetic information to two computational models, the joint source channel model and the trigram model, to automatically identify, extract, and learn transliteration unit (TU) pairs from the source and target language words. For a given input text, the system then predicts the 10 best candidate outputs. We further extend the work to make the target word prediction module more robust by applying phonological analysis to the generated target sentence. Both models were evaluated on a set of 2000 Romanized Bengali test words. Our initial evaluation results clearly show that the joint source channel model performs much better than the trigram model.
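As a rough illustration of the joint source channel idea mentioned in the abstract, the sketch below scores a candidate transliteration as a product of probabilities over aligned TU pairs, each conditioned on the previous pair. It is a minimal toy under stated assumptions, not the authors' implementation: the one-pair (bigram) context, the unsmoothed counts, the hand-segmented training words, and the names `train` and `joint_probability` are all hypothetical choices made for this example.

```python
from collections import defaultdict

# Sentinel TU pair marking the start of a word (an assumption of this sketch).
START = ("<s>", "<s>")

def train(aligned_words):
    """Count TU-pair bigrams from words already segmented into
    (romanized, bengali) transliteration-unit pairs."""
    bigram = defaultdict(lambda: defaultdict(int))
    for pairs in aligned_words:
        prev = START
        for pair in pairs:
            bigram[prev][pair] += 1
            prev = pair
    return bigram

def joint_probability(bigram, pairs):
    """Joint probability of a source/target word pair, computed as the
    product of P(current TU pair | previous TU pair). No smoothing."""
    prob = 1.0
    prev = START
    for pair in pairs:
        total = sum(bigram[prev].values())
        if total == 0:
            return 0.0  # previous TU pair never seen as a context
        prob *= bigram[prev][pair] / total
        prev = pair
    return prob

# Toy, hand-segmented training data (hypothetical alignment).
corpus = [
    [("a", "আ"), ("mi", "মি")],             # ami  -> আমি
    [("a", "আ"), ("ma", "মা"), ("r", "র")],  # amar -> আমার
]
model = train(corpus)

# Two candidate segmentations of the romanized input "ami".
candidates = [
    [("a", "আ"), ("mi", "মি")],
    [("a", "অ"), ("mi", "মি")],
]
ranked = sorted(candidates, key=lambda c: joint_probability(model, c), reverse=True)
best = "".join(target for _, target in ranked[0])
```

In a system like the one described, the ranking step would presumably keep the 10 highest-scoring candidates rather than only the first; here `ranked[0]` is taken simply to keep the toy short.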
