
Universal Grapheme-to-Phoneme Prediction Over Latin Alphabets


Abstract

We consider the problem of inducing grapheme-to-phoneme mappings for unknown languages written in a Latin alphabet. First, we collect a dataset of 107 languages with known grapheme-phoneme relationships, along with a short text in each language. We then cast our task in the framework of supervised learning, where each known language serves as a training example, and predictions are made on unknown languages. We induce an undirected graphical model that learns phonotactic regularities, thus relating textual patterns to plausible phonemic interpretations across the entire range of languages. Our model correctly predicts grapheme-phoneme pairs with an F1-measure of over 88%.
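
To make the reported metric concrete, the following is a minimal sketch of how an F1-measure over grapheme-phoneme pairs could be computed for a single language. The abstract does not spell out the evaluation protocol, so this uses the standard definition of F1 (the harmonic mean of precision and recall). The function name `pair_f1` and the toy pairs are hypothetical and are not data from the paper.

```python
def pair_f1(predicted, gold):
    """Return (precision, recall, F1) over sets of (grapheme, phoneme) pairs.

    Assumption: a predicted pair counts as correct only if the exact same
    (grapheme, phoneme) pair appears in the gold inventory.
    """
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy example (invented): gold and predicted mappings for one language.
gold = {("c", "k"), ("c", "s"), ("a", "a"), ("t", "t")}
predicted = {("c", "k"), ("a", "a"), ("t", "t"), ("t", "d")}

precision, recall, f1 = pair_f1(predicted, gold)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

In this toy case three of the four predicted pairs are correct and three of the four gold pairs are recovered, so precision, recall, and F1 all equal 0.75. In the supervised setup described above, such a score would be computed on languages held out from training.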
