Using Word Embeddings for Bilingual Unsupervised WSD



Abstract

Unsupervised Word Sense Disambiguation (WSD) is one of the challenging problems in natural language processing. Recently, an unsupervised bilingual WSD approach was proposed. It uses a context-aware EM formulation to estimate the sense distribution from the co-occurrence counts of cross-linked words in comparable corpora, with WordNet-based similarity measures used to approximate those counts. In this paper, we extend the existing approach by exploring the feasibility of using Word Embeddings to approximate the counts. We evaluated our approach on the Hindi-Marathi language pair in the health domain. Using the combination of Word Embeddings and WordNet-based similarity measures, we observed improvements of 8.5% and 2.5% in the F-score of verbs and adjectives, respectively, for Marathi, and a 7% improvement in the F-score of adjectives for Hindi. The experiments show that the combination of Word Embeddings and WordNet-based similarity measures is a reasonable approximation for bilingual WSD.
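The abstract does not say how the two similarity signals are combined, so the following is only a minimal sketch of the general idea: an embedding-based cosine similarity and a WordNet-based similarity score are blended into one value that could stand in for a co-occurrence count. The linear interpolation, the `weight` parameter, the toy vectors, the toy WordNet scores, and all function names are assumptions made for illustration; none of them are taken from the paper.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as having no similarity
    return dot / (norm_u * norm_v)


def combined_similarity(embedding_sim, wordnet_sim, weight=0.5):
    """Blend two similarity scores by linear interpolation.

    ASSUMPTION: the interpolation and the `weight` parameter are
    hypothetical; the paper's actual combination is not given here.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return weight * embedding_sim + (1.0 - weight) * wordnet_sim


# Toy data, invented for illustration only.
TOY_EMBEDDINGS = {
    "doctor": [0.9, 0.1, 0.3],
    "nurse": [0.8, 0.2, 0.4],
    "river": [0.1, 0.9, 0.2],
}
TOY_WORDNET_SIM = {
    ("doctor", "nurse"): 0.7,
    ("doctor", "river"): 0.1,
}


def approx_cooccurrence(word_a, word_b, weight=0.5):
    """Similarity-based stand-in for a co-occurrence count of two words."""
    emb_sim = cosine_similarity(TOY_EMBEDDINGS[word_a], TOY_EMBEDDINGS[word_b])
    wn_sim = TOY_WORDNET_SIM[(word_a, word_b)]
    return combined_similarity(emb_sim, wn_sim, weight)


if __name__ == "__main__":
    for pair in TOY_WORDNET_SIM:
        print(pair, round(approx_cooccurrence(*pair), 3))
```

With these toy values the related pair ("doctor", "nurse") scores higher than the unrelated pair ("doctor", "river"), which is the behaviour one would want from any stand-in for co-occurrence counts in an EM-based sense estimator.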


Bibliographic details

Source
International Conference on Natural Language Processing | 2015 | pp. 55-60 | 6 pages
Authors
Sudha Bhingardive; Dhirendra Singh; Rudramurthy V; Pushpak Bhattacharyya
Original format: PDF

Related literature


1. Unsupervised Word-Sense Disambiguation Using Bilingual Comparable Corpora [J]. Hiroyuki KAJI, Yasutsugu MORIMOTO. IEICE Transactions on Information and Systems, 2005, No. 2.
2. Adaptive cross-contextual word embedding for word polysemy with unsupervised topic modeling [J]. Li Shuangyin, Pan Rong, Luo Haoyu. Knowledge-Based Systems, 2021.
3. Unsupervised Word Segmentation and Lexicon Discovery Using Acoustic Word Embeddings [J]. Herman Kamper, Aren Jansen, Sharon Goldwater. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2016, No. 4.
4. Unsupervised Bilingual Word Embedding Agreement for Unsupervised Neural Machine Translation [C]. Haipeng Sun, Rui Wang, Kehai Chen. Annual Meeting of the Association for Computational Linguistics, 2019.
5. Parallel Sentence Detection in Comparable Corpora with Bilingual Word Embeddings for Low-Resource Languages [D]. Cadigan, John. 2018.
6. Words in the bilingual brain: an fNIRS brain imaging investigation of lexical processing in sign-speech bimodal bilinguals [O]. Ioulia Kovelman, Mark H. Shalinsky, Melody S. Berens. 2014.
7. Unsupervised Joint Training of Bilingual Word Embeddings [O]. Benjamin Marie, Atsushi Fujita. 2019.
