
Multilingual number transcription for text-to-speech conversion

Abstract

This paper describes the text normalization module of a fully trainable text-to-speech conversion system and its application to number transcription. The main goal is a language-independent text normalization module that is built from data rather than from expert rules. The paper proposes a general architecture based on statistical machine translation techniques, composed of three main modules: a tokenizer that splits the input text into a token graph, a phrase-based translation module that translates the tokens, and a post-processing module that removes some tokens. Number transcription is an important part of the text normalization problem, and the architecture has been evaluated on it in several languages: English, Spanish and Romanian.
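The three-module pipeline named in the abstract can be illustrated with a toy sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the function names, the digit-plus-position token format (`4_1` for a 4 in the tens place), the hand-written phrase table, and the `<del>` marker. The sketch also simplifies in two ways: it uses a flat token sequence rather than a token graph, and a greedy longest-match lookup stands in for a trained phrase-based statistical decoder.

```python
import re

# Hypothetical marker for tokens the post-processing step should drop.
DELETE = "<del>"

# Hypothetical hand-written phrase table (source token tuple -> target words).
# A trainable system would learn such pairs, with scores, from parallel data.
PHRASE_TABLE = {
    ("1_1", "2_0"): ["twelve"],
    ("4_1",): ["forty"],
    ("2_1",): ["twenty"],
    ("0_0",): [DELETE],
    ("1_0",): ["one"],
    ("2_0",): ["two"],
}
MAX_PHRASE_LEN = max(len(key) for key in PHRASE_TABLE)


def tokenize(text):
    """Split text on whitespace; expand each number into digit_position tokens."""
    tokens = []
    for word in text.split():
        if re.fullmatch(r"[0-9]+", word):
            n = len(word)
            # Position 0 is the units digit, 1 the tens digit, and so on.
            tokens.extend(f"{d}_{n - 1 - i}" for i, d in enumerate(word))
        else:
            tokens.append(word)
    return tokens


def translate(tokens):
    """Greedy longest-match lookup in the phrase table (not a real SMT decoder)."""
    out = []
    i = 0
    while i < len(tokens):
        for span in range(min(MAX_PHRASE_LEN, len(tokens) - i), 0, -1):
            key = tuple(tokens[i:i + span])
            if key in PHRASE_TABLE:
                out.extend(PHRASE_TABLE[key])
                i += span
                break
        else:
            # No phrase matched: pass the token through unchanged.
            out.append(tokens[i])
            i += 1
    return out


def postprocess(tokens):
    """Remove tokens that were translated to the deletion marker."""
    return [t for t in tokens if t != DELETE]


def normalize(text):
    return " ".join(postprocess(translate(tokenize(text))))


print(normalize("I have 42 cats"))  # I have forty two cats
print(normalize("40"))              # forty
print(normalize("12"))              # twelve
```

In this sketch the zero in "40" is translated to the deletion marker and then removed by post-processing, which is one plausible reading of why a token-removal step follows translation. Supporting another language would mean supplying a different phrase table while leaving the three functions unchanged.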
