Journal of Intelligent & Fuzzy Systems: Applications in Engineering and Technology

Improving unsupervised neural aspect extraction for online discussions using out-of-domain classification



Abstract

Deep learning architectures based on self-attention have recently achieved and surpassed state-of-the-art results in the tasks of unsupervised aspect extraction and topic modeling. While models such as neural attention-based aspect extraction (ABAE) have been successfully applied to user-generated texts, they are less coherent when applied to traditional data sources such as news articles and newsgroup documents. In this work, we introduce a simple approach based on sentence filtering in order to improve the topical aspects learned from newsgroup-based content without modifying the basic mechanism of ABAE. We train a probabilistic classifier to distinguish between out-of-domain texts (outer dataset) and in-domain texts (target dataset). Then, during data preparation, we filter out sentences that have a low probability of being in-domain and train the neural model on the remaining sentences. The positive effect of sentence filtering on topic coherence is demonstrated in comparison with aspect extraction models trained on unfiltered texts.
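As a rough illustration of the filtering step described in the abstract, the sketch below trains a logistic regression classifier on TF-IDF features to score how likely each sentence is to be in-domain, then discards low-probability sentences before training the aspect model. The classifier type, TF-IDF features, the 0.5 threshold, and the `train_abae` call are assumptions for illustration only; the paper specifies only a probabilistic out-of-domain classifier applied during data preparation.

```python
# Minimal sketch of out-of-domain sentence filtering, assuming scikit-learn.
# Dataset names, the 0.5 threshold, and train_abae() are illustrative
# placeholders, not the authors' exact setup.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression


def filter_in_domain(target_sentences, outer_sentences, threshold=0.5):
    """Keep only target-dataset sentences the classifier scores as in-domain."""
    # Label target (in-domain) sentences 1 and outer (out-of-domain) sentences 0.
    texts = list(target_sentences) + list(outer_sentences)
    labels = [1] * len(target_sentences) + [0] * len(outer_sentences)

    vectorizer = TfidfVectorizer(max_features=50000)
    features = vectorizer.fit_transform(texts)

    clf = LogisticRegression(max_iter=1000)
    clf.fit(features, labels)

    # Score each target sentence and drop those with low in-domain probability.
    probs = clf.predict_proba(vectorizer.transform(target_sentences))[:, 1]
    return [s for s, p in zip(target_sentences, probs) if p >= threshold]


# The filtered sentences would then be passed to the unmodified ABAE model, e.g.:
# train_abae(filter_in_domain(newsgroup_sentences, outer_sentences))
```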


