International Conference on Pattern Recognition and Machine Learning

How Does Chinese Segmentation Strategy Effect on Sentiment Analysis of Short Text?

Abstract

In Chinese natural language processing, one particular problem is how to choose a word segmentation strategy; the common options are char-based and word-based segmentation. For sentiment analysis of short text, as compared with long text, word-based segmentation faces a further problem: short text contains more ambiguous or unregistered (out-of-vocabulary) words. The features extracted under different Chinese word segmentation strategies affect the statistical distribution of the features and, in turn, the accuracy of sentiment analysis. This paper evaluates the effect of five Chinese segmentation strategies on sentiment analysis of short text. We chose two word-based Chinese Word Segmentation (CWS) methods and three char-based n-gram schemes, then transformed the resulting Bag-of-Words (BOW) representation into a Vector Space Model (VSM), which was finally fed into several classifiers to predict the sentiment polarity of short text. To reduce the impact of any single corpus, the study is based on a collection of five public corpora.
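
The char-based half of the pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the values n = 1, 2, 3 are an assumption (the abstract only says "three char-based n-gram"), the function names and the two toy sentences are hypothetical, raw term counts stand in for whatever VSM weighting the paper used, and the word-based segmenters and the classifiers are left out.

```python
from collections import Counter


def char_ngrams(text, n):
    """Split text into overlapping character n-grams (char-based segmentation)."""
    chars = [c for c in text if not c.isspace()]
    return ["".join(chars[i:i + n]) for i in range(len(chars) - n + 1)]


def build_vocabulary(token_lists):
    """Map each distinct token to a column index, in sorted token order."""
    vocab = sorted({tok for tokens in token_lists for tok in tokens})
    return {tok: idx for idx, tok in enumerate(vocab)}


def to_count_vector(tokens, vocab):
    """Turn one document's bag of tokens into a term-count vector.

    Iterating the vocab dict follows insertion order, which matches the
    column indices assigned in build_vocabulary.
    """
    counts = Counter(tokens)
    return [counts.get(tok, 0) for tok in vocab]


# Hypothetical toy sentences, not taken from the paper's corpora.
docs = ["这部电影很好看", "这部电影不好看"]

# Assumed n-gram sizes; the abstract does not state which three were used.
for n in (1, 2, 3):
    token_lists = [char_ngrams(d, n) for d in docs]
    vocab = build_vocabulary(token_lists)
    vectors = [to_count_vector(t, vocab) for t in token_lists]
    print(f"n={n}: vocabulary size {len(vocab)}")
    for doc, vec in zip(docs, vectors):
        print("  ", doc, vec)
```

Changing n changes both the vocabulary and how sparse the vectors are, which is the shift in feature distribution the abstract refers to. In a fuller experiment these count vectors would be reweighted if desired and passed to a classifier.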
