Journal: Expert Systems with Applications

A two-step hybrid unsupervised model with attention mechanism for aspect extraction


Abstract

Social networking sites hold a wealth of user-generated unstructured text that can support fine-grained sentiment analysis of changing marketplace dynamics. In aspect-level sentiment analysis, the aspect term extraction (ATE) task identifies the targets of user opinions in a sentence. In the last few years, deep learning approaches have significantly improved the performance of aspect extraction. However, recent models rely on the accuracy of the dependency parser and part-of-speech (POS) tagger, so performance degrades when a sentence does not follow the language constraints and the text contains a variety of multi-word aspect terms. Furthermore, the lack of domain and contextual information makes it difficult to extract the most relevant, domain-specific aspect terms. Existing approaches are also unable to capture long-term dependencies for noun phrases, and therefore fail to extract some valid aspect terms. This paper therefore proposes a two-step hybrid unsupervised model that combines linguistic patterns with deep learning techniques to improve the ATE task. The first step uses rule-based methods to extract single-word and multi-word aspects, which are then pruned to the domain-specific relevant aspects using fine-tuned word embeddings. In the second step, the aspects extracted in the first step are used as labeled data to train an attention-based deep learning model for aspect term extraction. Experimental evaluation on the SemEval-16 dataset validates our approach against the most recent and baseline techniques. (c) 2020 Elsevier Ltd. All rights reserved.
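
As an illustration only, the two-step idea in the abstract could be sketched roughly as follows. Everything concrete here is a hypothetical stand-in: the toy POS lexicon, the 2-d vectors, the domain seed, the 0.7 threshold, and all function names are assumptions made for this sketch. The abstract does not specify the paper's actual rules, embeddings, or attention model, so none of them are reproduced; the sketch stops at producing pseudo-labels that a sequence tagger could be trained on.

```python
import math

# Hypothetical toy POS lexicon; a real system would use a POS tagger.
TOY_POS = {
    "the": "DET", "battery": "NOUN", "life": "NOUN", "is": "VERB",
    "great": "ADJ", "but": "CONJ", "screen": "NOUN", "looks": "VERB",
    "dim": "ADJ", "my": "DET", "friend": "NOUN", "agrees": "VERB",
}

# Hypothetical 2-d vectors standing in for fine-tuned word embeddings.
TOY_VEC = {
    "battery": (0.9, 0.1), "life": (0.8, 0.3),
    "screen": (0.95, 0.05), "friend": (0.1, 0.9),
}
DOMAIN_SEED = (1.0, 0.0)  # assumed direction representing the product domain


def candidate_aspects(tokens):
    """Step 1a (assumed rule): each maximal run of nouns is a candidate span."""
    spans, start = [], None
    for i, tok in enumerate(tokens + ["<end>"]):  # sentinel closes a final run
        is_noun = TOY_POS.get(tok) == "NOUN"
        if is_noun and start is None:
            start = i
        elif not is_noun and start is not None:
            spans.append((start, i))  # half-open span [start, i)
            start = None
    return spans


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def prune(tokens, spans, threshold=0.7):
    """Step 1b: keep spans whose mean vector is close to the domain seed."""
    kept = []
    for start, end in spans:
        vecs = [TOY_VEC[t] for t in tokens[start:end] if t in TOY_VEC]
        if not vecs:
            continue  # no embedding available, so the candidate is dropped
        mean = tuple(sum(dim) / len(vecs) for dim in zip(*vecs))
        if cosine(mean, DOMAIN_SEED) >= threshold:
            kept.append((start, end))
    return kept


def bio_labels(tokens, spans):
    """Step 2 input: turn kept spans into BIO pseudo-labels for a tagger."""
    labels = ["O"] * len(tokens)
    for start, end in spans:
        labels[start] = "B-ASP"
        for i in range(start + 1, end):
            labels[i] = "I-ASP"
    return labels


tokens = "the battery life is great but the screen looks dim my friend agrees".split()
spans = prune(tokens, candidate_aspects(tokens))
print(list(zip(tokens, bio_labels(tokens, spans))))
```

In this toy run the multi-word candidate "battery life" and the single-word candidate "screen" survive pruning, while "friend" is discarded as off-domain. The resulting token/label pairs play the role of the automatically generated training data that the abstract says is fed to the attention-based model in the second step.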
