CS-Embed at SemEval-2020 Task 9: The effectiveness of code-switched word embeddings for sentiment analysis



Abstract

The growing popularity and range of applications of sentiment analysis of social media posts have naturally led to sentiment analysis of posts written in multiple languages, a practice known as code-switching. While recent research into code-switched posts has focused on the use of multilingual word embeddings, these embeddings were not trained on code-switched data. In this work, we present word embeddings trained on code-switched tweets, specifically those that mix Spanish and English, known as Spanglish. We explore the embedding space to discover how the embeddings capture the meanings of words in both languages. We test the effectiveness of these embeddings by participating in SemEval-2020 Task 9: Sentiment Analysis on Code-Mixed Social Media Text. We used them to train a sentiment classifier that achieves an F1 score of 0.722. This is higher than the competition baseline of 0.656, and our team (CodaLab username francesita) ranked 14th out of 29 participating teams.
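
The abstract mentions exploring the embedding space to see how words from both languages are represented. One common way to do this, sketched here as an assumption rather than as the authors' actual procedure, is to rank a word's nearest neighbours by cosine similarity. The vectors, the vocabulary, and the helper names `cosine_similarity` and `nearest_neighbours` below are all hypothetical and invented for illustration; none of them come from the paper.

```python
import math

# Toy 3-dimensional vectors, invented purely for illustration.
# They are NOT taken from the paper's embeddings.
toy_embeddings = {
    "happy": [0.9, 0.1, 0.0],
    "feliz": [0.8, 0.2, 0.1],
    "sad": [-0.7, 0.3, 0.1],
    "triste": [-0.8, 0.2, 0.0],
    "the": [0.0, 0.0, 1.0],
}


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest_neighbours(word, embeddings, k=2):
    """Return the k words most similar to `word`, most similar first."""
    query = embeddings[word]
    scored = [
        (other, cosine_similarity(query, vector))
        for other, vector in embeddings.items()
        if other != word
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Which toy word sits closest to "happy"?
print(nearest_neighbours("happy", toy_embeddings, k=1))
```

With embeddings trained on code-switched text, one would hope that a Spanish word and its English counterpart end up as close neighbours, as the toy vectors here are constructed to show.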


Bibliographic details

Source
International Workshop on Semantic Evaluation | 2020 | pp. 922-927 | 6 pages
Authors
Frances A. Laureano De Leon; Florimond Gueniat; Harish Tayyar Madabushi
Original format: PDF

Similar literature


1. An Integrated Word Embedding-Based Dual-Task Learning Method for Sentiment Analysis [J]. Yanping Fu, Yun Liu, Sheng-Lung Peng. Arabian Journal for Science and Engineering. Section A, Sciences. 2020, No. 4

2. word2set: WordNet-Based Word Representation Rivaling Neural Word Embedding for Lexical Similarity and Sentiment Analysis [J]. Jimenez Sergio, Gonzalez Fabio A., Gelbukh Alexander. IEEE Computational Intelligence Magazine. 2019, No. 2

3. Reed at SemEval-2020 Task 9: Fine-Tuning and Bag-of-Words Approaches to Code-Mixed Sentiment Analysis [C]. Vinay Gopalan, Mark Hopkins. International Workshop on Semantic Evaluation. 2020

4. Improved GloVe Word Embedding Using Linear Weighting Scheme for Word Similarity Tasks [D]. Lu, Qinglan. 2021

5. Wide range screening of algorithmic bias in word embedding models using large sentiment lexicons reveals underreported bias types [O]. David Rozado. 2020

6. Quality of Word Embeddings on Sentiment Analysis Tasks [O]. Çano, Erion; Morisio, Maurizio. 2017
