Word segmentation through cross-lingual word-to-phoneme alignment


Abstract

We present our new alignment model Model 3P for cross-lingual word-to-phoneme alignment, and show that unsupervised learning of word segmentation is more accurate when information of another language is used. Word segmentation with cross-lingual information is highly relevant to bootstrap pronunciation dictionaries from audio data for Automatic Speech Recognition, bypass the written form in Speech-to-Speech Translation or build the vocabulary of an unseen language, particularly in the context of under-resourced languages. Using Model 3P for the alignment between English words and Spanish phonemes outperforms a state-of-the-art monolingual word segmentation approach [1] on the BTEC corpus [2] by up to 42% absolute in F-Score on the phoneme level and a GIZA++ alignment based on IBM Model 3 by up to 17%.
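The link between a word-to-phoneme alignment and a word segmentation can be illustrated with a small sketch. This is not Model 3P itself: it assumes an alignment is already given as one source-word index per target phoneme, and it simply opens a new segment whenever that index changes. The function names, the toy phoneme strings, and the boundary-based F-score are illustrative assumptions; the paper's own phoneme-level F-Score may be defined differently.

```python
from typing import List, Set


def segment_from_alignment(phonemes: List[str], aligned_word: List[int]) -> List[List[str]]:
    """Group consecutive phonemes that are aligned to the same source word.

    aligned_word[i] is the (hypothetical) index of the source-language word
    that phoneme i is aligned to.
    """
    if len(phonemes) != len(aligned_word):
        raise ValueError("need exactly one aligned word index per phoneme")
    segments: List[List[str]] = []
    for i, (phoneme, word) in enumerate(zip(phonemes, aligned_word)):
        if i == 0 or word != aligned_word[i - 1]:
            segments.append([phoneme])    # aligned word changed: start a new segment
        else:
            segments[-1].append(phoneme)  # same aligned word: extend the segment
    return segments


def boundaries(segments: List[List[str]]) -> Set[int]:
    """Return the internal boundary positions, counted in phonemes from the start."""
    positions: Set[int] = set()
    offset = 0
    for segment in segments[:-1]:  # the end of the utterance is not a boundary
        offset += len(segment)
        positions.add(offset)
    return positions


def boundary_f_score(hypothesis: List[List[str]], reference: List[List[str]]) -> float:
    """Harmonic mean of boundary precision and recall (assumed metric)."""
    hyp, ref = boundaries(hypothesis), boundaries(reference)
    if not hyp and not ref:
        return 1.0  # both segmentations contain a single segment
    correct = len(hyp & ref)
    if correct == 0:
        return 0.0
    precision = correct / len(hyp)
    recall = correct / len(ref)
    return 2 * precision * recall / (precision + recall)


# Toy example: English "good morning" aligned to a rough Spanish phoneme string.
phonemes = ["b", "w", "e", "n", "o", "s", "d", "i", "a", "s"]
alignment = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
hypothesis = segment_from_alignment(phonemes, alignment)
reference = [["b", "w", "e", "n", "o", "s"], ["d", "i", "a", "s"]]
score = boundary_f_score(hypothesis, reference)
```

In this toy case the hypothesized boundary after the sixth phoneme matches the reference, so `score` is 1.0. A real alignment may link one word to non-contiguous phonemes, which this sketch would split into several segments.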


Bibliographic details

Source
2012 IEEE Workshop on Spoken Language Technology | 2012 | pp. 85-90 | 6 pages
Conference location: Miami, FL (US)
Authors
Stahlberg Felix; Schlippe Tim; Vogel Stephan; Schultz Tanja
Author affiliation

Cognitive Systems Lab, Karlsruhe Institute of Technology (KIT), Germany;

Original format: PDF
Text language: English
Chinese Library Classification: speech signal processing
Keywords
alignment model; speech-to-speech translation; under-resourced language; word segmentation;


Similar literature


1. Word segmentation and pronunciation extraction from phoneme sequences through cross-lingual word-to-phoneme alignment [J]. Felix Stahlberg, Tim Schlippe, Stephan Vogel. Computer Speech and Language. 2016, issue JANa
2. Discriminative Word Alignment over Multiple Word Segmentations [J]. XI Ning, DAI Xinyu, HUANG Shujian. 电子学报：英文版 (Acta Electronica Sinica, English edition). 2014, No. 2
3. Cross-lingual numerical distance priming with second-language number words in native- to third-language number word translation [J]. Duyck W, Depestel I, Fias W. The Quarterly Journal of Experimental Psychology: QJEP. 2008, No. 9
4. Word segmentation through cross-lingual word-to-phoneme alignment [C]. Stahlberg Felix, Schlippe Tim, Vogel Stephan. IEEE Workshop on Spoken Language Technology. 2012
5. Multilingual model using cross-lingual word embeddings based on subword alignment and cross-task projection [D]. Sakuma Jin. 2019
6. Spaced words and kmacs: fast alignment-free sequence comparison based on inexact word matches [O]. Sebastian Horwege, Sebastian Lindner, Marcus Boden. 2014
7. Cross-Lingual Alignment of Contextual Word Embeddings, with Applications to Zero-shot Dependency Parsing [O]. Tal Schuster, Ori Ram, Regina Barzilay. 2019
