Tweet2Vec: Character-Based Distributed Representations for Social Media


Abstract

Text from social media provides a set of challenges that can cause traditional NLP approaches to fail. Informal language, spelling errors, abbreviations, and special characters are all commonplace in these posts, leading to a prohibitively large vocabulary size for word-level approaches. We propose a character composition model, tweet2vec, which finds vector-space representations of whole tweets by learning complex, non-local dependencies in character sequences. The proposed model outperforms a word-level baseline at predicting user-annotated hashtags associated with the posts, doing significantly better when the input contains many out-of-vocabulary words or unusual character sequences. Our tweet2vec encoder is publicly available.
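The vocabulary argument in the abstract can be illustrated with a minimal sketch. This is not the tweet2vec model and does not reproduce its encoder; it only contrasts how often a word-level vocabulary and a character-level vocabulary fail to cover a noisy post. The helper names (`build_vocab`, `oov_rate`) and the toy strings are hypothetical and chosen purely for illustration.

```python
def build_vocab(texts, tokenize):
    """Collect the set of symbols seen in the training texts."""
    vocab = set()
    for text in texts:
        vocab.update(tokenize(text))
    return vocab


def oov_rate(text, vocab, tokenize):
    """Fraction of symbols in `text` that are missing from `vocab`."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return sum(t not in vocab for t in tokens) / len(tokens)


word_tokens = str.split  # whitespace-separated words
char_tokens = list       # one token per character

# Toy data (an assumption for illustration, not from the paper).
train = ["so happy today", "this is great", "see you tomorrow"]
test = "soooo happpy todayyy"  # informal spelling with stretched words

word_vocab = build_vocab(train, word_tokens)
char_vocab = build_vocab(train, char_tokens)

word_oov = oov_rate(test, word_vocab, word_tokens)
char_oov = oov_rate(test, char_vocab, char_tokens)

print(f"word-level OOV rate: {word_oov:.2f}")
print(f"char-level OOV rate: {char_oov:.2f}")
```

In this toy case every stretched word is unseen at the word level, while every character has already been observed. A character composition model starts from that fully covered symbol set and has to learn the sequence-level structure on top of it.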


Publication details

Source: Annual Meeting of the Association for Computational Linguistics, 2016, pp. 269-274 (6 pages)
Authors: Bhuwan Dhingra; Zhong Zhou; Dylan Fitzpatrick; Michael Muehl; William W. Cohen
Original format: PDF

