Universal Grapheme-to-Phoneme Prediction Over Latin Alphabets

Abstract

We consider the problem of inducing grapheme-to-phoneme mappings for unknown languages written in a Latin alphabet. First, we collect a data-set of 107 languages with known grapheme-phoneme relationships, along with a short text in each language. We then cast our task in the framework of supervised learning, where each known language serves as a training example, and predictions are made on unknown languages. We induce an undirected graphical model that learns phonotactic regularities, thus relating textual patterns to plausible phonemic interpretations across the entire range of languages. Our model correctly predicts grapheme-phoneme pairs with over 88% F1-measure.
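
The abstract reports its result as an F1-measure over predicted grapheme-phoneme pairs. As a rough illustration of what such a score measures, the sketch below computes precision, recall, and F1 between a predicted and a gold set of pairs. It is not the authors' evaluation code: the function name `pair_f1` and the toy pairs are hypothetical, and the paper may aggregate scores across languages differently.

```python
def pair_f1(predicted, gold):
    """Return (precision, recall, F1) over sets of (grapheme, phoneme) pairs."""
    predicted, gold = set(predicted), set(gold)
    true_pos = len(predicted & gold)  # pairs that are both predicted and correct
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f1


# Toy, made-up pairs for a single hypothetical language (illustration only).
gold = {("c", "k"), ("c", "s"), ("a", "a"), ("x", "ks")}
predicted = {("c", "k"), ("a", "a"), ("x", "z")}

p, r, f = pair_f1(predicted, gold)
print(f"precision={p:.2f} recall={r:.2f} F1={f:.2f}")
```

Treating each (grapheme, phoneme) pair as a set element means a grapheme that maps to several phonemes contributes one item per mapping, so a missed mapping lowers recall and a spurious one lowers precision.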

Bibliographic details

Source
Conference on empirical methods in natural language processing; Conference on computational natural language learning | 2012 | pp. 332-343 | 12 pages
Authors
Young-Bum Kim; Benjamin Snyder
Original format: PDF

Similar literature

1. Optimality of Universal Bayesian Sequence Prediction for General Loss and Alphabet [J]. Hutter Marcus. Journal of machine learning research. 2003, issue Nov
2. Processing film collections labelled in non-Latin alphabets: The Indian cinema collection [J]. Spencer Churchill. Journal of digital media management. 2017, issue 2
3. A letter visual-similarity matrix for Latin-based alphabets [J]. Ian C. Simpson, Petroula Mousikou, Juan Manuel Montoya, Behavior Research Methods. 2013, issue 2
4. Universal Grapheme-to-Phoneme Prediction Over Latin Alphabets [C]. Young-Bum Kim, Benjamin Snyder. Conference on empirical methods in natural language processing. 2012
5. Language policy and language planning in Kazakhstan: About the proposed shift from the Cyrillic alphabet to the Latin alphabet. [D]. Dotton, Zura. 2016
6. The transferability of handwriting skills: from the Cyrillic to the Latin alphabet [O]. Thibault Asselborn, Wafa Johal, Bolat Tleubayev, 2021
7. Optimality of Universal Bayesian Sequence Prediction for General Loss and Alphabet [O]. Hutter, Marcus. 2003
