Word segmentation through cross-lingual word-to-phoneme alignment

Abstract

We present Model 3P, our new alignment model for cross-lingual word-to-phoneme alignment, and show that unsupervised learning of word segmentation is more accurate when information from another language is used. Word segmentation with cross-lingual information is highly relevant for bootstrapping pronunciation dictionaries from audio data for Automatic Speech Recognition, bypassing the written form in Speech-to-Speech Translation, or building the vocabulary of an unseen language, particularly in the context of under-resourced languages. Using Model 3P for the alignment between English words and Spanish phonemes outperforms a state-of-the-art monolingual word segmentation approach [1] on the BTEC corpus [2] by up to 42% absolute in F-Score on the phoneme level, and a GIZA++ alignment based on IBM Model 3 by up to 17%.
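
The abstract does not describe the internals of Model 3P or the exact F-Score definition used in the evaluation. As a minimal sketch under those caveats, the Python below illustrates only the general idea that a word-to-phoneme alignment induces a word segmentation: consecutive target phonemes aligned to the same source word are grouped into one segment, and the resulting boundaries are scored against a reference. The function names, the toy English/Spanish example, the informal phoneme symbols, and the boundary-based F-score are all assumptions made for illustration, not the authors' implementation or metric.

```python
from typing import List, Set


def segment_by_alignment(phonemes: List[str], aligned_word: List[int]) -> List[List[str]]:
    """Group consecutive phonemes that are aligned to the same source word.

    aligned_word[i] is the (hypothetical) index of the source-language word
    that phoneme i is aligned to.
    """
    if len(phonemes) != len(aligned_word):
        raise ValueError("need exactly one alignment link per phoneme")
    segments: List[List[str]] = []
    for i, (phoneme, word) in enumerate(zip(phonemes, aligned_word)):
        if i == 0 or word != aligned_word[i - 1]:
            segments.append([phoneme])    # aligned word changed: start a new segment
        else:
            segments[-1].append(phoneme)  # same aligned word: extend current segment
    return segments


def boundaries(segments: List[List[str]]) -> Set[int]:
    """Return internal boundary offsets (in phonemes), excluding the utterance end."""
    cuts: Set[int] = set()
    position = 0
    for segment in segments[:-1]:
        position += len(segment)
        cuts.add(position)
    return cuts


def boundary_f_score(hypothesis: List[List[str]], reference: List[List[str]]) -> float:
    """Harmonic mean of boundary precision and recall (an assumed metric)."""
    hyp, ref = boundaries(hypothesis), boundaries(reference)
    if not hyp and not ref:
        return 1.0  # both are a single segment: perfect agreement
    true_positives = len(hyp & ref)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(hyp)
    recall = true_positives / len(ref)
    return 2 * precision * recall / (precision + recall)


# Toy example: English "good morning" aligned to the phonemes of Spanish "buenos dias".
phonemes = ["b", "w", "e", "n", "o", "s", "d", "i", "a", "s"]
reference = segment_by_alignment(phonemes, [0] * 6 + [1] * 4)
hypothesis = segment_by_alignment(phonemes, [0] * 3 + [1] * 3 + [2] * 4)
score = boundary_f_score(hypothesis, reference)
```

In this sketch the alignment links are given by hand; in the setting the abstract describes, they would instead be produced by an alignment model such as Model 3P or GIZA++.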

Bibliographic details

Source
IEEE Workshop on Spoken Language Technology | 2012 | 6 pages
Authors
Stahlberg Felix; Schlippe Tim; Vogel Stephan; Schultz Tanja
Original format: PDF
Chinese Library Classification: TN912-53
Keywords
alignment model; speech-to-speech translation; under-resourced language; word segmentation

Similar literature

1. Word segmentation and pronunciation extraction from phoneme sequences through cross-lingual word-to-phoneme alignment [J]. Felix Stahlberg, Tim Schlippe, Stephan Vogel. Computer Speech and Language, 2016, issue JANa.
2. Discriminative Word Alignment over Multiple Word Segmentations [J]. XI Ning, DAI Xinyu, HUANG Shujian. 电子学报：英文版 (Acta Electronica Sinica, English edition), 2014, issue 002.
3. Cross-lingual numerical distance priming with second-language number words in native- to third-language number word translation [J]. Duyck W, Depestel I, Fias W. The Quarterly Journal of Experimental Psychology: QJEP, 2008, issue 9.
4. Word segmentation through cross-lingual word-to-phoneme alignment [C]. Stahlberg Felix, Schlippe Tim, Vogel Stephan. 2012 IEEE Workshop on Spoken Language Technology, 2012.
5. Multilingual model using cross-lingual word embeddings based on subword alignment and cross-task projection [D]. Sakuma Jin, 2019.
6. Spaced words and kmacs: fast alignment-free sequence comparison based on inexact word matches [O]. Sebastian Horwege, Sebastian Lindner, Marcus Boden, 2014.
7. Cross-Lingual Alignment of Contextual Word Embeddings, with Applications to Zero-shot Dependency Parsing [O]. Tal Schuster, Ori Ram, Regina Barzilay, 2019.