A Word Embedding Model Learned from Political Tweets

Abstract

Distributed word representations have recently contributed to significant improvements in many natural language processing (NLP) tasks, and distributional semantics has become one of the important trends in machine learning (ML) applications. Word embeddings are distributed representations of words that capture semantic relationships learned from a large corpus of text. In a social context, the distributed representation of a word is likely to differ from embeddings learned from general text. This is partly due to the distinctive lexical-semantic features and morphological structure of social media text such as tweets, which imply different word vector representations. In this paper, we collect and present a political social dataset consisting of over four million English tweets. An artificial neural network (NN) is trained to learn word co-occurrence and generate word vectors from this political corpus of tweets. The model is 136 MB and includes word representations for a vocabulary of over 86K unique words and phrases. The learned model is expected to contribute to many ML and NLP applications in the analysis of microblogging online social networks (OSN), such as semantic similarity and cluster analysis tasks.

Bibliographic details

Source
International Conference on Computer Engineering and Systems | 2018 | 703p | 7 pages in total
Authors
Noufa N. Alnajran; Keeley A. Crockett; David McLean; Annabel Latham
Original format
PDF
Chinese Library Classification
Computing technology, computer technology
Keywords
Twitter; Computational modeling; Semantics; Training; Data collection; Task analysis; Context modeling

