Journal: Procedia Computer Science

Combining Linguistic, Semantic and Lexicon Feature for Emoji Classification in Twitter Dataset

Abstract

Emoji are picture characters used in social media to express the emotion of a text message. Despite the increasing use of emoji, few studies have examined the relationship between emoji and text. Because emoji are diverse and many of them have similar meanings, emoji classification is more complex than ordinary text classification. In this paper, we build a computational model by extracting three kinds of features (linguistic, semantic, and lexicon features) to improve emoji classification performance. We then train on 400k tweets using two different classifiers, a Stochastic Gradient Descent classifier and Logistic Regression. The experiments showed that our proposed features, used with Logistic Regression, outperformed the baseline.
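
The feature-combination idea in the abstract can be illustrated with a minimal sketch. The abstract does not say which concrete features, lexicon, or library the authors used, so everything below is an assumption made for illustration: the toy lexicon, the three feature functions, the hashed bag of words standing in for semantic features, and the four example tweets are all hypothetical. The classifier is a small from-scratch softmax (multinomial logistic) regression trained by stochastic gradient descent, not the authors' implementation.

```python
import math
import random
import zlib

# Hypothetical toy lexicon; the abstract does not name the lexicon used.
POSITIVE_WORDS = {"love", "happy", "great"}
NEGATIVE_WORDS = {"sad", "angry", "hate"}


def linguistic_features(text, tokens):
    """Surface cues: scaled token count, exclamation marks, shouted tokens."""
    shouted = sum(1 for t in tokens if len(t) > 1 and t.isupper())
    return [len(tokens) / 10.0, float(text.count("!")), float(shouted)]


def semantic_features(tokens, dim=8):
    """Hashed bag of words, a crude stand-in for embedding-based features."""
    vec = [0.0] * dim
    for t in tokens:
        vec[zlib.crc32(t.lower().encode("utf-8")) % dim] += 1.0
    return vec


def lexicon_features(tokens):
    """Counts of tokens found in the positive and negative word lists."""
    lowered = [t.lower().strip("!?.,") for t in tokens]
    pos = sum(1 for t in lowered if t in POSITIVE_WORDS)
    neg = sum(1 for t in lowered if t in NEGATIVE_WORDS)
    return [float(pos), float(neg)]


def extract_features(text):
    """Concatenate the three feature groups into one vector."""
    tokens = text.split()
    return (linguistic_features(text, tokens)
            + semantic_features(tokens)
            + lexicon_features(tokens))


def softmax(scores):
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def class_scores(weights, x):
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def train(samples, labels, n_classes, epochs=200, lr=0.1, seed=0):
    """Softmax regression fitted with plain per-sample SGD updates."""
    rng = random.Random(seed)
    dim = len(samples[0]) + 1  # one extra slot for the bias term
    weights = [[0.0] * dim for _ in range(n_classes)]
    order = list(range(len(samples)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            x = samples[i] + [1.0]
            probs = softmax(class_scores(weights, x))
            for c, row in enumerate(weights):
                grad = probs[c] - (1.0 if labels[i] == c else 0.0)
                for j, v in enumerate(x):
                    row[j] -= lr * grad * v
    return weights


def predict(weights, features):
    scores = class_scores(weights, features + [1.0])
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical toy data: class 0 is a heart emoji, class 1 a crying face.
EMOJI = ["\u2764", "\U0001F622"]
tweets = ["I love this so happy", "great day love it",
          "so sad and angry", "I hate this sad day"]
labels = [0, 0, 1, 1]

X = [extract_features(t) for t in tweets]
weights = train(X, labels, n_classes=len(EMOJI))
predicted_emoji = [EMOJI[predict(weights, x)] for x in X]
```

Keeping each feature group in its own function makes it easy to switch a group off and compare results, which is one plausible way to measure how much each kind of feature contributes. At the scale described in the abstract (400k tweets), an established library implementation of these classifiers would be a more realistic choice than a hand-written loop.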