Distributional Models with Syntactic Contexts for the Measurement of Word Similarity in Brazilian Portuguese

Eduardo E. Berlitz; Denis A. Araujo; Allan B. Silva; Rodrigo R. Righi; Sandro J. Rigo



Abstract

Word similarity provides significant support for tasks in natural language processing. Several works use lexical resources such as WordNet for semantic similarity and synonym identification. Nevertheless, out-of-vocabulary words and missing links between senses are recognized problems of this approach. Distributional proposals such as word embeddings have been used successfully to address these problems, but their lack of contextual information can prevent even better results. Distributional models that include contextual information can bring advantages to this area, but they are still scarcely explored. Therefore, this work studies the advantages of incorporating syntactic information into distributional models, aiming at better results in semantic similarity approaches. For that purpose, it explores existing lexical and distributional techniques for measuring word similarity in Brazilian Portuguese. Experiments were carried out with the lexical database WordNet, using different techniques over a standard dataset. The results indicate that word embeddings can cover out-of-vocabulary words and achieve better results than lexical approaches. The main contribution of this article is a new approach that applies syntactic context in the training of word embeddings on a Brazilian Portuguese corpus. The comparison of this model with the outcome of the previous experiments shows sound results and presents relevant complementary aspects.
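The abstract does not say how syntactic contexts are built, so the following is only an illustrative sketch of the general idea, not the authors' method. It contrasts the linear window contexts typically used to train word embeddings with contexts taken from dependency arcs. The `Token` type, both function names, the `-inv` marker for inverse relations, and the hand-annotated toy sentence are all assumptions made for this example.

```python
from typing import List, NamedTuple, Tuple


class Token(NamedTuple):
    form: str    # surface word
    head: int    # index of the syntactic head in the sentence, -1 for the root
    deprel: str  # dependency relation to the head


def window_contexts(tokens: List[Token], size: int = 1) -> List[Tuple[str, str]]:
    """(word, context) pairs from a symmetric linear window of `size` words."""
    pairs = []
    for i, tok in enumerate(tokens):
        lo = max(0, i - size)
        hi = min(len(tokens), i + size + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((tok.form, tokens[j].form))
    return pairs


def dependency_contexts(tokens: List[Token]) -> List[Tuple[str, str]]:
    """(word, context) pairs from dependency arcs, in both directions."""
    pairs = []
    for tok in tokens:
        if tok.head < 0:
            continue  # the root has no head, so it contributes no arc of its own
        head = tokens[tok.head]
        # The head sees its dependent, labelled with the relation.
        pairs.append((head.form, f"{tok.form}/{tok.deprel}"))
        # The dependent sees its head, labelled with the inverse relation.
        pairs.append((tok.form, f"{head.form}/{tok.deprel}-inv"))
    return pairs


# Toy Brazilian Portuguese sentence, annotated by hand (hypothetical parse):
# "o gato preto dorme" ("the black cat sleeps").
sentence = [
    Token("o", 1, "det"),
    Token("gato", 3, "nsubj"),
    Token("preto", 1, "amod"),
    Token("dorme", -1, "root"),
]

if __name__ == "__main__":
    print(window_contexts(sentence, size=1))
    print(dependency_contexts(sentence))
```

With a window of one word, "gato" and "dorme" never co-occur because "preto" sits between them, whereas the dependency contexts link them directly through the subject relation. Pairs of this kind could then be fed to an embedding trainer that accepts arbitrary (word, context) input.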


Bibliographic details

Source: Journal of computer sciences, 2019, Issue 10, 12 pages
Authors: Eduardo E. Berlitz; Denis A. Araujo; Allan B. Silva; Rodrigo R. Righi; Sandro J. Rigo
Original format: PDF
Keywords: Word Similarity; WordNet; Word Embeddings; Computational Linguistics; Natural Language Processing


Similar literature


1. Distributional Models with Syntactic Contexts for the Measurement of Word Similarity in Brazilian Portuguese [J]. Eduardo E. Berlitz, Denis A. Araujo, Allan B. Silva. Journal of computer sciences. 2019, Issue 10.
2. Comparing explicit and predictive distributional semantic models endowed with syntactic contexts [J]. Gamallo Pablo. Language Resources and Evaluation. 2017, Issue 3.
3. A semantic textual similarity measurement model based on the syntactic-semantic representation [J]. Tang Zhuo, Xiao Qi, Zhu Li. Intelligent data analysis. 2019, Issue 4.
4. The Role of Utterance Boundaries and Word Frequencies for Part-of-speech Learning in Brazilian Portuguese Through Distributional Analysis [C]. Pablo Faria. Annual conference of the North American Chapter of the Association for Computational Linguistics: human language technologies; Workshop on cognitive modeling and computational linguistics. 2019.
5. Word order in Brazilian Portuguese: A minimalist analysis [D]. Silva, Glaucia Valeria. 1999.
6. Syntactic Structural Assignment in Brazilian Portuguese-Speaking Children With Specific Language Impairment [O]. Talita Fortunato-Tavares, Claudia R. F. de Andrade, Debora M. Befi-Lopes.
7. Finding Word Substitutions Using a Distributional Similarity Baseline and Immediate Context Overlap [O]. Aurelie Herbelot. 2010.
