Pacific-Asia Conference on Knowledge Discovery and Data Mining

Probabilistic Matrix Factorization Leveraging Contexts for Unsupervised Relation Extraction



Abstract

Clustering the semantic relations between entities extracted from a corpus is one of the main issues in unsupervised relation extraction (URE). Previous methods assume a huge corpus because they rely on entity pairs that appear frequently in the corpus. In this paper, we present a URE method that works well for a small corpus by using the word sequences extracted as relations. The feature vectors of these word sequences are extremely sparse. To deal with the sparseness problem, we take two approaches: dimension reduction, and leveraging context from the whole corpus, including sentences from which no relations are extracted. The context is captured with feature co-occurrences, which indicate that two features appear in a single sentence. Both approaches are implemented by a probabilistic matrix factorization that jointly factorizes the matrix of feature vectors and the matrix of feature co-occurrences. Experimental results show that our method outperforms previously proposed methods.
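
The abstract does not state the model's objective, so the following is only a minimal sketch of what a joint factorization of this kind could look like, not the authors' method. It assumes a squared-error loss with L2 regularization (the usual MAP reading of probabilistic matrix factorization under Gaussian priors), a feature factor `V` shared by both matrices, and plain gradient descent. The function names `cooccurrence` and `joint_factorize`, the weight `alpha`, and all hyperparameter values are hypothetical choices made for illustration.

```python
import numpy as np

def cooccurrence(sentences, n_features):
    """Count, for each pair of distinct features, the sentences containing both.

    `sentences` is a list of feature-index lists. It may include sentences
    from which no relation was extracted; they still contribute context.
    """
    C = np.zeros((n_features, n_features))
    for feats in sentences:
        idx = sorted(set(feats))
        for a in idx:
            for b in idx:
                if a != b:
                    C[a, b] += 1.0
    return C

def joint_factorize(X, C, rank=2, lam=0.1, alpha=1.0, lr=0.005, steps=2000, seed=0):
    """Assumed objective (not taken from the paper):

        ||X - U V^T||^2 + alpha * ||C - V W^T||^2
            + lam * (||U||^2 + ||V||^2 + ||W||^2)

    X: (n_sequences, n_features) sparse feature vectors of word sequences.
    C: (n_features, n_features) feature co-occurrence counts.
    V is shared, so co-occurrence context shapes the low-dimensional
    representation U of each word sequence.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    U = 0.1 * rng.standard_normal((n, rank))
    V = 0.1 * rng.standard_normal((d, rank))
    W = 0.1 * rng.standard_normal((d, rank))
    losses = []
    for _ in range(steps):
        Ex = X - U @ V.T          # residual on the feature-vector matrix
        Ec = C - V @ W.T          # residual on the co-occurrence matrix
        reg = (U ** 2).sum() + (V ** 2).sum() + (W ** 2).sum()
        losses.append((Ex ** 2).sum() + alpha * (Ec ** 2).sum() + lam * reg)
        # Gradients of the objective with respect to each factor.
        gU = -2 * Ex @ V + 2 * lam * U
        gV = -2 * Ex.T @ U - 2 * alpha * Ec @ W + 2 * lam * V
        gW = -2 * alpha * Ec.T @ V + 2 * lam * W
        U -= lr * gU
        V -= lr * gV
        W -= lr * gW
    return U, V, W, losses

# Toy data (made up): six sentences over five features; only four of the
# sentences yielded a relation candidate, giving the four rows of X.
sentences = [[0, 1], [1, 2], [0, 1, 2], [3, 4], [3, 4], [2, 3]]
X = np.array([[1, 1, 0, 0, 0],
              [0, 1, 1, 0, 0],
              [0, 0, 0, 1, 1],
              [0, 0, 0, 1, 1]], dtype=float)
C = cooccurrence(sentences, n_features=5)
U, V, W, losses = joint_factorize(X, C)
```

The rows of `U` are dense low-dimensional representations of the word sequences. In a pipeline like the one the abstract describes, these rows would then be clustered into relation types. The clustering step is omitted here because the abstract does not say which algorithm is used.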
