Sentence-level sentiment tagging across different domains and genres.

Abstract

The demand for information about sentiment expressed in texts has stimulated growing interest in automatic sentiment analysis within Natural Language Processing (NLP). This dissertation is motivated by an unmet need for high-performance, domain-independent sentiment taggers and by pressing theoretical questions in NLP, where the limitations of specific approaches, as well as the synergies between them, remain largely unexplored.

Exploring the performance of the supervised corpus-based approach (CBA) to sentiment tagging, this study highlights the strong domain dependence of CBA. I present the development of lexicon-based approaches (LBA) built on general lexicons, such as WordNet, as a potential solution to the domain portability problem.

A system for extracting sentiment markers from WordNet's relations and glosses is developed and used to acquire word lists for a lexicon-based system for sentiment annotation at the sentence and text levels. The study shows that LBA performance is more stable across domains than that of CBA. Finally, the study proposes integrating LBA and CBA in an ensemble of classifiers using a precision-based voting technique that allows the ensemble to incorporate the best features of both. This combined approach outperforms both base learners and provides a promising solution to the domain-adaptation problem.

The study contributes to NLP (1) by developing algorithms for the automatic acquisition of sentiment-laden words from dictionary definitions; (2) by conducting a systematic study of approaches to sentiment classification and of the factors affecting their performance; (3) by refining the lexicon-based approach through valence shifter handling and parse tree information; and (4) by developing the combined CBA/LBA approach, which brings together the strengths of the two approaches and allows domain adaptation with limited amounts of labeled training data.

This study focuses on sentiment tagging at the sentence level and covers four genres: news, blogs, movie reviews, and product reviews. It compares sentiment annotation at different linguistic levels (words, sentences, and texts) and highlights the key differences between supervised machine learning methods that rely on annotated corpora (corpus-based approaches, CBA) and lexicon-based approaches (LBA) to sentiment tagging.
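The abstract names a lexicon-based tagger with valence shifter handling and a precision-based voting ensemble, but does not spell out either procedure. The Python sketch below is one plausible reading, not the dissertation's implementation. The word lists, the single-token negation rule, and the function names `lexicon_tag`, `class_precision`, and `precision_vote` are all hypothetical, and the weighting scheme (each classifier's vote counts in proportion to its held-out precision on the label it predicts) is an assumption.

```python
from collections import Counter

# Hypothetical toy word lists; the dissertation derives its lists from WordNet.
POSITIVE = {"good", "great", "excellent", "enjoyable"}
NEGATIVE = {"bad", "poor", "boring", "awful"}
NEGATORS = {"not", "never", "no"}  # a minimal stand-in for valence shifters


def lexicon_tag(sentence):
    """Tag a sentence by counting lexicon hits.

    A sentiment word that directly follows a negator has its polarity flipped.
    """
    score = 0
    flip = False
    for token in sentence.lower().split():
        word = token.strip(".,!?;:")
        if word in NEGATORS:
            flip = True
            continue
        polarity = (word in POSITIVE) - (word in NEGATIVE)  # +1, -1, or 0
        score += -polarity if flip else polarity
        flip = False  # the negator only affects the next token
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def class_precision(predictions, gold):
    """Per-label precision of one classifier on held-out data."""
    predicted = Counter(predictions)
    correct = Counter(p for p, g in zip(predictions, gold) if p == g)
    return {label: correct[label] / predicted[label] for label in predicted}


def precision_vote(votes, precisions):
    """Combine classifier votes, weighting each by its precision on its label.

    votes:      {classifier name: predicted label}
    precisions: {classifier name: {label: precision}}
    """
    totals = {}
    for name, label in votes.items():
        weight = precisions[name].get(label, 0.0)
        totals[label] = totals.get(label, 0.0) + weight
    return max(totals, key=totals.get)
```

In this sketch, a corpus-based classifier trained on one domain and the lexicon tagger would each have their per-label precision estimated on a small labeled sample from the target domain. `precision_vote` then favors whichever classifier has proved more reliable for the label it proposes.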

Bibliographic record

  • Author: Andreevskaia, Alina
  • Author affiliation: Concordia University (Canada)
  • Degree-granting institution: Concordia University (Canada)
  • Subject: Computer Science
  • Degree: Ph.D.
  • Year: 2009
  • Pages: 120
  • Original format: PDF
  • Language: English
