Automatic stopword generation using contextual semantics for sentiment analysis of Twitter

Abstract

In this paper we propose a semantic approach to automatically identify and remove stopwords from Twitter data. Unlike most existing approaches, which rely on outdated and context-insensitive stopword lists, our approach considers the contextual semantics and sentiment of words in order to measure their discrimination power. Evaluation results on six Twitter datasets show that removing our semantically identified stopwords from tweets increases binary sentiment classification performance over the classic pre-compiled stopword list by 0.42% in accuracy and 0.94% in F-measure. Our approach also reduces the sentiment classifier's feature space by 48.34% and the dataset sparsity by 1.17%, on average, compared to the classic method.
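
The two size-related measures mentioned in the abstract, feature-space reduction and dataset sparsity, can be illustrated with a minimal Python sketch. It does not reproduce the paper's semantic stopword identification. The stopword set, the whitespace tokenizer, the toy tweets, and the function names are all hypothetical, and sparsity is assumed here to mean the fraction of zero cells in the tweet-by-term matrix, which may differ from the paper's exact definition.

```python
def tokens(tweet, stopwords=frozenset()):
    """Lowercase, split on whitespace, and drop stopwords (assumed tokenizer)."""
    return [tok for tok in tweet.lower().split() if tok not in stopwords]


def vocabulary(tweets, stopwords=frozenset()):
    """Distinct terms that remain after stopword removal (the feature space)."""
    return {tok for tweet in tweets for tok in tokens(tweet, stopwords)}


def feature_space_reduction(tweets, stopwords):
    """Relative shrinkage of the vocabulary caused by removing the stopwords."""
    before = len(vocabulary(tweets))
    after = len(vocabulary(tweets, stopwords))
    return (before - after) / before if before else 0.0


def sparsity(tweets, stopwords=frozenset()):
    """Assumed definition: share of zero cells in the tweet-by-term matrix."""
    vocab = vocabulary(tweets, stopwords)
    if not tweets or not vocab:
        return 0.0
    # A cell is nonzero when a term occurs at least once in a tweet.
    nonzero = sum(len(set(tokens(tweet, stopwords))) for tweet in tweets)
    return 1.0 - nonzero / (len(tweets) * len(vocab))


# Hypothetical toy data, not taken from the paper's datasets.
toy_tweets = [
    "the movie was great",
    "the plot was dull",
    "great cast and great music",
]
toy_stopwords = frozenset({"the", "was", "and"})

print("feature-space reduction:", feature_space_reduction(toy_tweets, toy_stopwords))
print("sparsity before:", sparsity(toy_tweets))
print("sparsity after:", sparsity(toy_tweets, toy_stopwords))
```

Comparing two stopword lists in this way only needs the two sets passed in turn. The classification accuracy and F-measure figures would additionally require a trained sentiment classifier, which this sketch omits.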