Journal: Information Processing & Management

Disambiguating context-dependent polarity of words: An information retrieval approach


Abstract

The paper introduces PolaritySim, a novel approach to disambiguating the context-dependent sentiment polarity of words. The task of resolving the polarity of a given word instance as positive or negative is addressed as an information retrieval problem. At the pre-processing stage, two vectors of context features are built for each word w: one based on all its occurrences in the positive polarity corpus (consumer reviews with high ratings), and another based on its contexts in the negative polarity corpus (reviews with low ratings). Lexico-syntactic context features are automatically generated from dependency parse graphs of the sentences containing the word. These two vectors are treated as "documents", one with positive and one with negative polarity. To resolve the contextual polarity of a specific instance of the word w in a given sentence, its context feature vector is built in the same way and is treated as the "query". An information retrieval (IR) model is then applied to calculate the similarity of the "query" to each of the two "documents", and the polarity of the best matching "document" is attributed to the "query". The method uses no prior polarity sentiment lexicons or purposefully annotated training datasets. The only external resource used is a readily available corpus of user-rated reviews. Evaluation on different domains shows that PolaritySim is more effective than the state-of-the-art baselines, Support Vector Machines (SVM) and Multinomial Naive Bayes (MNB) classifiers, on three out of four datasets. PolaritySim, SVM and MNB were also evaluated with an out-of-domain training corpus. The results indicate that PolaritySim is more effective and robust than SVM and MNB when used with an out-of-domain corpus. We conclude that an IR-based approach can be an effective and robust alternative to machine learning approaches for disambiguating word-level polarity using either within-domain or out-of-domain training corpora.
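The "query" versus two "documents" idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not name the IR model, so plain cosine similarity over raw feature counts stands in for it here. The feature strings (such as `amod:screen`) and the function names `build_document` and `resolve_polarity` are hypothetical, and the features are assumed to have already been extracted from dependency parses.

```python
from collections import Counter
from math import sqrt


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[feat] for feat, count in a.items())
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_document(contexts):
    """Pool the context features of every occurrence of a word in one corpus."""
    doc = Counter()
    for features in contexts:
        doc.update(features)
    return doc


def resolve_polarity(query_features, pos_doc, neg_doc):
    """Assign the polarity of the better-matching "document" to the "query".

    Ties go to "positive"; that tie-break is an arbitrary choice of this sketch.
    """
    query = Counter(query_features)
    pos_score = cosine(query, pos_doc)
    neg_score = cosine(query, neg_doc)
    return "positive" if pos_score >= neg_score else "negative"


# Hypothetical context features for the word "small" in each corpus.
pos_doc = build_document([
    ["amod:camera", "conj:light"],
    ["amod:size", "conj:light"],
])
neg_doc = build_document([
    ["amod:screen", "conj:dim"],
    ["amod:room", "conj:noisy"],
])

# A new occurrence of "small" whose context resembles the low-rated reviews.
label = resolve_polarity(["amod:screen", "conj:dim"], pos_doc, neg_doc)
```

In this toy data the query shares features only with the negative "document", so `label` is `"negative"`. A weighted retrieval model could replace `cosine` without changing the rest of the flow.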
