Journal of Southwest Jiaotong University (《西南交通大学学报》)

Part-Whole Relation Acquisition Based on Unsupervised Learning

Abstract

This paper proposes an unsupervised learning method for extracting part-whole relations from Chinese free text. First, a subsequence extraction algorithm acquires concept pairs and their context patterns from a collection of domain texts, and a distributional semantic model is built from the concept pairs and their context patterns. Next, a co-clustering algorithm groups concept pairs that share the same semantic relation into clusters. An L1-regularized logistic regression model is then trained to select the features of each cluster and to obtain the context patterns that represent the cluster's semantic relation. Finally, the clusters that express the part-whole relation are identified from these patterns, which yields the part-whole concept pairs. Experimental results show that the method is effective: its F-measure reaches 68.97%, higher than that of traditional clustering (55.77%) and pattern matching (61.95%).
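The feature-selection step can be illustrated with a small sketch. It is not the authors' implementation: the toy observations, the English context patterns, the cluster assignment (a stand-in for the co-clustering output), and the names `build_matrix` and `fit_l1_logistic` are all assumptions made for illustration. The sketch builds a concept-pair by context-pattern count matrix, then fits an L1-regularized logistic regression by proximal gradient descent so that uninformative patterns receive a weight of exactly zero.

```python
import math
from collections import Counter

# Hypothetical toy data: one entry per occurrence of a concept pair in a context pattern.
OBSERVATIONS = [
    (("engine", "car"), "X is part of Y"),
    (("engine", "car"), "Y consists of X"),
    (("engine", "car"), "X and Y"),
    (("wheel", "bicycle"), "X is part of Y"),
    (("wheel", "bicycle"), "Y consists of X"),
    (("flu", "fever"), "X causes Y"),
    (("smoking", "cancer"), "X causes Y"),
    (("smoking", "cancer"), "X and Y"),
]
# Assumed cluster membership, standing in for the result of co-clustering.
CLUSTER = {("engine", "car"), ("wheel", "bicycle")}


def build_matrix(observations):
    """Return (pairs, patterns, counts) with counts[i][j] = occurrences of pair i in pattern j."""
    counts = Counter(observations)
    pairs = sorted({pair for pair, _ in observations})
    patterns = sorted({pattern for _, pattern in observations})
    matrix = [[counts[(pair, pattern)] for pattern in patterns] for pair in pairs]
    return pairs, patterns, matrix


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def soft_threshold(x, t):
    """Proximal operator of the L1 penalty: shrink x toward zero by t."""
    if x > t:
        return x - t
    if x < -t:
        return x + t
    return 0.0


def fit_l1_logistic(X, y, lam=0.05, lr=0.5, steps=500):
    """Proximal gradient descent for logistic loss + lam * ||w||_1 (bias unpenalized)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(steps):
        probs = [sigmoid(sum(wj * xj for wj, xj in zip(w, row)) + b) for row in X]
        errs = [p - t for p, t in zip(probs, y)]
        grad_w = [sum(e * row[j] for e, row in zip(errs, X)) / n for j in range(d)]
        grad_b = sum(errs) / n
        w = [soft_threshold(wj - lr * gj, lr * lam) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


pairs, patterns, X = build_matrix(OBSERVATIONS)
y = [1 if pair in CLUSTER else 0 for pair in pairs]
w, b = fit_l1_logistic(X, y)
weights = dict(zip(patterns, w))
# Patterns with a positive weight characterize the cluster.
selected = [pattern for pattern in patterns if weights[pattern] > 0]
print(selected)
```

In this toy setting the pattern "X and Y" occurs equally inside and outside the cluster, so the L1 penalty keeps its weight at zero, while the patterns that occur only with the cluster's pairs keep positive weights. A production system would more likely rely on an established solver than on this hand-written loop.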
