International Conference on Semantics, Knowledge and Grids

Learning Tibetan-Chinese Cross-Lingual Word Embeddings

Abstract

The idea of word embeddings is based on the distributional hypothesis of the linguist Harris (1954), which holds that words with the same semantics are distributed in similar contexts. Learning vector-space word embeddings is a technique of central importance in natural language processing. In recent years, cross-lingual word vectors have received increasing attention. Cross-lingual word vectors enable knowledge transfer between different languages; most importantly, this transfer can take place between resource-rich and low-resource languages. This paper uses Tibetan and Chinese Wikipedia corpora to train monolingual word vectors, mainly with the fastText training method, and the two monolingual embedding spaces are then aligned by canonical correlation analysis (CCA), yielding Tibetan-Chinese cross-lingual word vectors. In the experiments, we evaluate the resulting word representations on standard lexical semantic evaluation tasks, and the results show that this method gives a measurable improvement in the semantic representation of the word vectors.
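As a rough illustration of the pipeline described in the abstract, the sketch below assumes two pre-trained monolingual fastText vector files and a small Tibetan-Chinese seed dictionary, and uses scikit-learn's CCA to project both spaces into a shared correlated space. The file names, dictionary format, and number of CCA components are assumptions for illustration, not details taken from the paper.

```python
# Minimal sketch of CCA-based alignment of two monolingual embedding
# spaces (file names and dictionary format are assumed, not the paper's code).
import numpy as np
from gensim.models import KeyedVectors
from sklearn.cross_decomposition import CCA

# Pre-trained monolingual fastText vectors in word2vec text format (paths assumed).
bo = KeyedVectors.load_word2vec_format("bo_wiki.vec")  # Tibetan
zh = KeyedVectors.load_word2vec_format("zh_wiki.vec")  # Chinese

# Seed translation dictionary: one "tibetan_word<TAB>chinese_word" pair per line.
pairs = []
with open("bo_zh_seed_dict.txt", encoding="utf-8") as f:
    for line in f:
        src, tgt = line.rstrip("\n").split("\t")
        if src in bo.key_to_index and tgt in zh.key_to_index:
            pairs.append((src, tgt))

# Stack the vectors of the dictionary pairs; rows are aligned translations.
X = np.vstack([bo[s] for s, _ in pairs])  # Tibetan side
Y = np.vstack([zh[t] for _, t in pairs])  # Chinese side

# Fit CCA on the aligned pairs, then project both full vocabularies
# into the shared, maximally correlated space.
cca = CCA(n_components=100, max_iter=1000)
cca.fit(X, Y)
bo_proj, zh_proj = cca.transform(bo.vectors, zh.vectors)

# Cross-lingual retrieval: nearest Chinese words to one Tibetan query word
# by cosine similarity in the projected space.
def normalize(m):
    return m / np.linalg.norm(m, axis=1, keepdims=True)

bo_n, zh_n = normalize(bo_proj), normalize(zh_proj)
query = bo_n[bo.key_to_index[pairs[0][0]]]      # an example Tibetan word from the seed dictionary
top = np.argsort(-zh_n @ query)[:5]
print([zh.index_to_key[i] for i in top])
```

Projecting both vocabularies through the CCA rotations keeps translation pairs close in the shared space, which is what the lexical semantic evaluations in the paper measure; in practice the number of components and the size of the seed dictionary both affect retrieval quality.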