MDL-Based Models for Transliteration Generation

Abstract

This paper presents models for automatic transliteration of proper names between languages that use different alphabets. The models are an extension of our work on automatic discovery of patterns of etymological sound change, based on the Minimum Description Length Principle. The models for pairwise alignment are extended with algorithms for prediction that produce transliterated names. We present results on 13 parallel corpora for 7 languages, including English, Russian, and Farsi, extracted from Wikipedia headlines. The transliteration corpora are released for public use. The models achieve up to 88% on word-level accuracy and up to 99% on symbol-level F-score. We discuss the results from several perspectives, and analyze how corpus size, the language pair, the type of names (persons, locations), and noise in the data affect the performance.
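
The abstract reports word-level accuracy and symbol-level F-score without defining them. The following is a minimal sketch of how such metrics could be computed, assuming that word-level accuracy is the exact-match rate and that symbol-level F-score is the longest-common-subsequence (LCS) based measure often used in transliteration evaluation. Both definitions are assumptions, not confirmed by this record, and all function names are hypothetical.

```python
from typing import Iterable, Tuple


def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b (dynamic programming)."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def symbol_f_score(predicted: str, reference: str) -> float:
    """Assumed LCS-based F-score for one predicted/reference pair."""
    if not predicted or not reference:
        return 0.0  # empty strings are scored as 0.0 in this sketch
    lcs = lcs_length(predicted, reference)
    if lcs == 0:
        return 0.0
    precision = lcs / len(predicted)
    recall = lcs / len(reference)
    return 2 * precision * recall / (precision + recall)


def evaluate(pairs: Iterable[Tuple[str, str]]) -> Tuple[float, float]:
    """Return (word-level accuracy, mean symbol-level F-score) over (predicted, reference) pairs."""
    pairs = list(pairs)
    accuracy = sum(p == r for p, r in pairs) / len(pairs)
    mean_f = sum(symbol_f_score(p, r) for p, r in pairs) / len(pairs)
    return accuracy, mean_f
```

Under these assumed definitions, a prediction that differs from the reference by one symbol counts as wrong at the word level but still earns a high symbol-level score, which is one way the two figures in the abstract can diverge.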

Bibliographic details

Source: SLSP 2013 (International Conference on Statistical Language and Speech Processing), 2013, 12 pages
Authors: Javad Nouri; Lidia Pivovarova; Roman Yangarber
Original format: PDF
Chinese Library Classification: 006.3/5
Keywords: MDL-Based Models; Transliteration Generation; automatic transliteration
