Journal: Language Resources and Evaluation

Sense representations for Portuguese: experiments with sense embeddings and deep neural language models

Abstract

Sense representations have gone beyond word representations such as Word2Vec, GloVe and FastText and achieved innovative performance on a wide range of natural language processing tasks. Although very useful in many applications, traditional approaches for generating word embeddings have a serious drawback: they produce a single vector representation for a given word, ignoring the fact that ambiguous words can assume different meanings. In this paper, we explore unsupervised sense representations which, unlike traditional word embeddings, are able to induce different senses of a word by analyzing its contextual semantics in a text. The unsupervised sense representations investigated in this paper are sense embeddings and deep neural language models. We present the first experiments carried out for generating sense embeddings for Portuguese. Our experiments show that the sense embedding model (Sense2vec) outperformed traditional word embeddings in the syntactic and semantic analogies task, indicating that the language resource generated here can improve the performance of NLP tasks in Portuguese. We also evaluated the performance of pre-trained deep neural language models (ELMo and BERT) in two transfer learning approaches, feature-based and fine-tuning, on the semantic textual similarity task. Our experiments indicate that the fine-tuned Multilingual and Portuguese BERT language models were able to achieve better accuracy than the ELMo model and the baselines.
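
The contrast the abstract draws between one vector per word and one vector per sense can be illustrated with a small toy sketch. Everything in it is an assumption made for illustration: the vectors are hand-made rather than learned from any corpus, the `word|POS` key format is one common way Sense2vec-style models tag tokens, and the helper names `cosine` and `nearest` are hypothetical. The Portuguese word "canto" is used because it can be a verb form ("I sing") or a noun ("corner").

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Toy word-level table: "canto" gets a single vector that blends both uses.
# All numbers are invented for illustration, not trained embeddings.
word_vectors = {
    "canto": [0.5, 0.5],
    "melodia": [1.0, 0.0],
    "esquina": [0.0, 1.0],
}

# Toy sense-level table: each (word, POS) pair has its own vector.
sense_vectors = {
    "canto|VERB": [0.9, 0.1],
    "canto|NOUN": [0.1, 0.9],
    "melodia|NOUN": [1.0, 0.0],
    "esquina|NOUN": [0.0, 1.0],
}


def nearest(key, table):
    """Return the other key in `table` most similar to `key`."""
    candidates = (k for k in table if k != key)
    return max(candidates, key=lambda k: cosine(table[key], table[k]))


if __name__ == "__main__":
    # The single word vector cannot separate the two meanings.
    print(cosine(word_vectors["canto"], word_vectors["melodia"]))
    print(cosine(word_vectors["canto"], word_vectors["esquina"]))
    # The sense-tagged vectors each have a distinct nearest neighbour.
    print(nearest("canto|VERB", sense_vectors))
    print(nearest("canto|NOUN", sense_vectors))
```

In this toy table the blended "canto" vector is equally close to "melodia" and "esquina", while the two sense-tagged entries each sit near a different neighbour. A trained model would learn such vectors from a POS-tagged corpus rather than have them set by hand.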