Multi-Task Word Alignment Triangulation for Low-Resource Languages


Abstract

We present a multi-task learning approach that jointly trains three word alignment models over disjoint bitexts of three languages: source, target and pivot. Our approach builds upon model triangulation, following Wang et al., which approximates a source-target model by combining source-pivot and pivot-target models. We develop a MAP-EM algorithm that uses triangulation as a prior, and show how to extend it to a multi-task setting. On a low-resource Czech-English corpus, using French as the pivot, our multi-task learning approach more than doubles the gains in both F- and BLEU scores compared to the interpolation approach of Wang et al. Further experiments reveal that the choice of pivot language does not significantly affect performance.
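The two ideas named in the abstract, triangulation through a pivot and a MAP update that uses the triangulated model as a prior, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy lexical translation tables, the function names `triangulate` and `map_m_step`, the prior strength `alpha`, and the particular Dirichlet-style smoothing form of the update.

```python
import numpy as np

def triangulate(t_src_given_piv, t_piv_given_tgt):
    """Approximate t(s|t) as sum over pivot words p of t(s|p) * t(p|t).

    Both inputs are column-stochastic: column j is a distribution over rows.
    """
    return t_src_given_piv @ t_piv_given_tgt

def map_m_step(expected_counts, prior, alpha):
    """Hypothetical MAP M-step: smooth expected counts toward the prior.

    Illustrative form: t(s|t) proportional to c(s, t) + alpha * prior(s|t),
    normalised over source words s for each target word t.
    """
    numer = expected_counts + alpha * prior
    return numer / numer.sum(axis=0, keepdims=True)

# Toy tables (made up): rows index the predicted word, columns the conditioning word.
t_sp = np.array([[0.9, 0.2],
                 [0.1, 0.8]])   # t(source | pivot)
t_pt = np.array([[0.7, 0.4],
                 [0.3, 0.6]])   # t(pivot | target)

t_tri = triangulate(t_sp, t_pt)  # triangulated t(source | target)

# Expected counts from a small source-target bitext; the second target word is unseen.
counts = np.array([[3.0, 0.0],
                   [1.0, 0.0]])
t_map = map_m_step(counts, t_tri, alpha=2.0)
```

In this sketch, a target word with no counts in the small bitext falls back entirely on the triangulated distribution, while a word with counts is pulled toward it by an amount controlled by `alpha`.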

Bibliographic details

Source: Conference on the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2015, pp. 1221-1226 (6 pages)
Authors: Tomer Levinboim; David Chiang
Original format: PDF

Similar literature

1. Multi-task Sequence Classification for Disjoint Tasks in Low-resource Languages [J]. Jarema Radom, Jan Kocoń. Procedia Computer Science. 2021, Issue a
2. Feature learning for efficient ASR-free keyword spotting in low-resource languages [J]. Ewald van der Westhuizen, Herman Kamper, Raghav Menon. Computer Speech and Language. 2022, Issue Jana
3. Loanword Identification in Low-Resource Languages with Minimal Supervision [J]. Chenggang Mi, Lei Xie, Yanning Zhang. ACM Transactions on Asian and Low-Resource Language Information Processing. 2020, Issue 3
4. Multi-Task Word Alignment Triangulation for Low-Resource Languages [C]. Tomer Levinboim, David Chiang. Conference on the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2015
5. Parallel Sentence Detection in Comparable Corpora with Bilingual Word Embeddings for Low-Resource Languages [D]. Cadigan, John. 2018
6. Improving Loanword Identification in Low-Resource Language with Data Augmentation and Multiple Feature Fusion [O]. Chenggang Mi, Shaolin Zhu, Rui Nie. 2021
7. Multi-Task Word Alignment Triangulation for Low-Resource Languages [O]. Tomer Levinboim, David Chiang. 2015
