Journal: International Journal of Innovative Research in Science, Engineering and Technology

Cross Domain Opinion Mining in Synonymically Structured Database

Abstract

Opinion mining aims to classify sentiment data into polarity categories, positive or negative. It is the field that analyzes people's opinions, sentiments, attitudes, and emotions from written language, and it is important for many applications such as opinion summarization, opinion integration, and review spam identification. On average, a human processes six articles per hour, against a machine's throughput of 10 per second. However, opinion information on the internet is often unstructured or semi-structured. Online product reviews are often unstructured, subjective, and hard to digest within a short time period. The main objective of our proposed work is to automatically determine human opinion from text written on web pages. Sentiment classification aims to automatically predict the sentiment polarity of users who publish product-based sentiment data. Applying a sentiment classifier across domains results in poor performance because each domain uses different sentiment words. We propose a method that trains a binary classifier from one or more domains and overcomes this problem of existing cross-domain sentiment classification methods. First, we create a synonym database for both the source and target domains and perform POS tagging. We then develop a product-based sentiment classification that uses a spectral clustering algorithm to align domain-specific words from different domains into unified clusters for opinion classification. Sentiment sensitivity is achieved with the help of the synonym database by measuring the distributional similarity between words. To investigate the effectiveness of our method, we compared it with several algorithms and developed a robust and generic cross-domain sentiment classifier.
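
The abstract gives no implementation details, so the following is only a minimal sketch of one idea it mentions: measuring similarity between words from the contexts they occur in, so that a domain-specific word in a source domain can be paired with a word in a target domain. Everything here is an assumption made for illustration: the four toy review snippets are invented, the function names `context_vectors`, `cosine`, and `most_similar` are hypothetical, and cosine similarity over raw co-occurrence counts stands in for whatever similarity measure the authors actually use. The synonym database, POS tagging, and spectral clustering steps are not shown.

```python
from collections import Counter
from math import sqrt

# Invented toy snippets (NOT data from the paper), one list per domain.
SOURCE_REVIEWS = [  # assumed source domain: books
    "exciting story highly recommend",
    "boring story total waste",
]
TARGET_REVIEWS = [  # assumed target domain: electronics
    "sharp screen highly recommend",
    "blurry screen total waste",
]


def context_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` positions."""
    vectors = {}
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            context = tokens[lo:i] + tokens[i + 1:hi]
            vectors.setdefault(word, Counter()).update(context)
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Pool both domains so shared context words can link domain-specific words.
vectors = context_vectors(SOURCE_REVIEWS + TARGET_REVIEWS)


def most_similar(word, candidates):
    """Return the candidate whose context vector is closest to `word`'s."""
    return max(candidates, key=lambda c: cosine(vectors[word], vectors[c]))


if __name__ == "__main__":
    for source_word in ("exciting", "boring"):
        match = most_similar(source_word, ["sharp", "blurry"])
        print(source_word, "->", match)
```

On this toy data, `exciting` pairs with `sharp` and `boring` with `blurry`, because each pair shares a domain-independent context word (`highly` or `total`). Note that `exciting` and `boring` also score as similar to each other through the shared word `story`, which illustrates why context similarity alone does not determine polarity.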
