
Bengali Basic Travel Expression Corpus: A statistical analysis


Abstract

The Japanese-English aligned Basic Travel Expression Corpus (BTEC) has served as a basic dataset for developing real-world Speech-to-Speech Translation (S2ST) systems in prior studies. This paper presents a detailed statistical analysis of the Bengali translation of the BTEC text and its phonetic transcriptions, aimed at developing English-Bengali speech translation applications in the travel domain. At different levels of the analysis hierarchy, the study examines the lexical and phonetic status of the corpus using frequency spectra, estimated population size, coverage ratio, goodness of fit of Large Number of Rare Events (LNRE) models, and transition patterns. The experimental observations offer insight into whether the corpus is sufficient for the travel domain and for building the basic components of an English-Bengali S2ST system.
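
To make two of the measures named above concrete, here is a minimal Python sketch of a frequency spectrum and a coverage ratio. The abstract does not give the paper's exact definitions, so this assumes common ones: the spectrum maps each frequency m to the number of distinct word types occurring exactly m times, and coverage is the share of all tokens accounted for by the k most frequent types. The function names and the toy English token list are illustrative assumptions, not material from the paper.

```python
from collections import Counter


def frequency_spectrum(tokens):
    """Map each frequency m to the number of distinct types seen exactly m times."""
    type_counts = Counter(tokens)
    return dict(Counter(type_counts.values()))


def coverage_ratio(tokens, top_k):
    """Fraction of all tokens covered by the top_k most frequent types (assumed definition)."""
    type_counts = Counter(tokens)
    covered = sum(count for _, count in type_counts.most_common(top_k))
    return covered / len(tokens)


# Toy stand-in for corpus tokens (hypothetical, not data from the paper).
tokens = "where is the station where is the hotel please".split()

spectrum = frequency_spectrum(tokens)
hapax_count = spectrum.get(1, 0)  # types occurring exactly once
top3_coverage = coverage_ratio(tokens, 3)
```

A large share of types that occur only once is the situation LNRE models are designed for; fitting such a model usually relies on a dedicated statistics package and is not sketched here.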
