IEEE Workshop on Spoken Language Technology

Word segmentation through cross-lingual word-to-phoneme alignment

Abstract

We present our new alignment model, Model 3P, for cross-lingual word-to-phoneme alignment, and show that unsupervised learning of word segmentation is more accurate when information from another language is used. Word segmentation with cross-lingual information is highly relevant for bootstrapping pronunciation dictionaries from audio data for Automatic Speech Recognition, bypassing the written form in Speech-to-Speech Translation, or building the vocabulary of an unseen language, particularly in the context of under-resourced languages. Using Model 3P for the alignment between English words and Spanish phonemes outperforms a state-of-the-art monolingual word segmentation approach [1] on the BTEC corpus [2] by up to 42% absolute in phoneme-level F-score, and a GIZA++ alignment based on IBM Model 3 by up to 17%.
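
As a rough illustration of how a word-to-phoneme alignment can yield a word segmentation, the sketch below assumes that every phoneme has already been aligned to exactly one source-word index, and that consecutive phonemes aligned to the same word form one segment. The alignment step itself (Model 3P or GIZA++) is not reproduced. The function names, the toy phoneme string, the alignment indices, and the boundary-based F-score are all illustrative assumptions and are not taken from the paper, whose exact evaluation procedure is not described in the abstract.

```python
from itertools import groupby


def segment_by_alignment(phonemes, aligned_word_idx):
    """Group consecutive phonemes that share the same aligned source word."""
    if len(phonemes) != len(aligned_word_idx):
        raise ValueError("one alignment index per phoneme is required")
    segments = []
    pairs = zip(phonemes, aligned_word_idx)
    for _, group in groupby(pairs, key=lambda pair: pair[1]):
        segments.append([phoneme for phoneme, _ in group])
    return segments


def boundaries(segments):
    """Return the phoneme positions at which a new segment starts."""
    positions, offset = set(), 0
    for segment in segments[:-1]:
        offset += len(segment)
        positions.add(offset)
    return positions


def boundary_f_score(hypothesis, reference):
    """Harmonic mean of boundary precision and recall (a generic metric,
    assumed here for illustration only)."""
    hyp, ref = boundaries(hypothesis), boundaries(reference)
    correct = len(hyp & ref)
    if correct == 0:
        return 0.0
    precision = correct / len(hyp)
    recall = correct / len(ref)
    return 2 * precision * recall / (precision + recall)


# Toy data: English "good morning" against an invented phoneme string for
# Spanish "buenos dias". The alignment indices are made up for illustration.
phonemes = ["b", "w", "e", "n", "o", "s", "d", "i", "a", "s"]
alignment = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
hypothesis = segment_by_alignment(phonemes, alignment)
reference = [["b", "w", "e", "n", "o", "s"], ["d", "i", "a", "s"]]
score = boundary_f_score(hypothesis, reference)
```

In this sketch a phoneme sequence whose alignment returns to an earlier word (for example indices 0, 1, 0) produces three segments, because only adjacent phonemes are grouped; a real system would need a policy for such cases.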