Tweet2Vec: Character-Based Distributed Representations for Social Media

Abstract

Text from social media provides a set of challenges that can cause traditional NLP approaches to fail. Informal language, spelling errors, abbreviations, and special characters are all commonplace in these posts, leading to a prohibitively large vocabulary size for word-level approaches. We propose a character composition model, tweet2vec, which finds vector-space representations of whole tweets by learning complex, non-local dependencies in character sequences. The proposed model outperforms a word-level baseline at predicting user-annotated hashtags associated with the posts, doing significantly better when the input contains many out-of-vocabulary words or unusual character sequences. Our tweet2vec encoder is publicly available.
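
The vocabulary argument in the abstract can be illustrated with a toy sketch. This is not the authors' implementation and does not include the tweet2vec model; it only contrasts word-level and character-level vocabularies on a few made-up posts. The function names (`word_vocab`, `char_vocab`, `encode_chars`), the sample posts, and the use of id 0 for unknown characters are all assumptions made for illustration.

```python
def word_vocab(posts):
    """Distinct whitespace-separated tokens across all posts."""
    return {tok for post in posts for tok in post.split()}


def char_vocab(posts):
    """Distinct characters across all posts."""
    return {ch for post in posts for ch in post}


def encode_chars(post, char_to_id, unk_id=0):
    """Map each character to an integer id; unseen characters get unk_id."""
    return [char_to_id.get(ch, unk_id) for ch in post]


# Hypothetical posts with informal spellings of the same words.
posts = [
    "gr8 game 2nite!!! #winning",
    "great game tonight! #winning",
    "gr8t gaaame tonite #winning",
]

words = word_vocab(posts)
chars = char_vocab(posts)

# Ids start at 1 so that 0 can be reserved for unknown characters.
char_to_id = {ch: i + 1 for i, ch in enumerate(sorted(chars))}

# A new post containing a spelling never seen as a whole word.
unseen = "grt game 2nite"
ids = encode_chars(unseen, char_to_id)

print("word vocabulary size:", len(words))
print("character vocabulary size:", len(chars))
print("'grt' is out of vocabulary at word level:", "grt" not in words)
print("every character of the new post is known:", all(i != 0 for i in ids))
```

Each new misspelling adds an entry to the word vocabulary, while the character vocabulary stays bounded by the symbols in use. A character-level encoder would consume a sequence like `ids` directly, which is why unusual spellings do not become out-of-vocabulary inputs.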

Bibliographic details

Source: Annual Meeting of the Association for Computational Linguistics | 2016 | xxxiv 605 p. | 6 pages
Authors: Bhuwan Dhingra; Zhong Zhou; Dylan Fitzpatrick; Michael Muehl; William W. Cohen
Original format: PDF
Chinese Library Classification: Computer software

