Journal: Computer Technology and Development (计算机技术与发展)

A Definition Extraction Algorithm Combining Soft and Hard Patterns

     

Abstract

Term definition extraction is an important topic in information extraction. This paper proposes an automatic definition extraction method that combines hard pattern matching with soft pattern matching. First, a library of hard patterns is matched against the input text to extract an initial set of candidate definition sentences. Next, a soft pattern matching model based on an n-gram language model computes a matching score between each sentence and the soft pattern. Two thresholds are then applied to this score: sentences that score above an upper threshold are recalled as definition sentences, and hard-matched sentences that score below a lower threshold are discarded as wrongly recalled non-definitions. Experimental results show that the proposed method substantially outperforms both pure hard pattern matching and pure soft pattern matching.
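The two-stage decision described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract specifies neither the hard pattern library, the order of the n-gram model, the smoothing, nor the threshold values. The regular expressions, the add-one smoothed bigram model, the training sentences, and the names `HARD_PATTERNS`, `BigramSoftPattern`, and `is_definition` are all assumptions made here for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical hard patterns; the paper's actual pattern library is not
# given in the abstract.
HARD_PATTERNS = [
    re.compile(r"^\w[\w ]* is (a|an|the) ", re.IGNORECASE),
    re.compile(r"^\w[\w ]* refers to ", re.IGNORECASE),
]

# Hypothetical definition sentences used to train the soft pattern.
TRAINING = [
    "an algorithm is a finite sequence of instructions",
    "a compiler is a program that translates source code",
    "recursion refers to a function calling itself",
]


def bigrams(tokens):
    """Return adjacent token pairs, with sentence boundary markers added."""
    padded = ["<s>"] + tokens + ["</s>"]
    return list(zip(padded, padded[1:]))


class BigramSoftPattern:
    """Add-one smoothed bigram model (an assumed stand-in for the
    paper's n-gram soft pattern model)."""

    def __init__(self, definition_sentences):
        self.context = Counter()  # count of each token as a left context
        self.pair = Counter()     # count of each (left, right) bigram
        vocab = set()
        for sentence in definition_sentences:
            for a, b in bigrams(sentence.lower().split()):
                self.context[a] += 1
                self.pair[(a, b)] += 1
                vocab.update((a, b))
        self.vocab_size = len(vocab) + 1  # one extra slot for unseen tokens

    def score(self, sentence):
        """Length-normalised log-probability; higher means a closer match."""
        pairs = bigrams(sentence.lower().split())
        logp = sum(
            math.log((self.pair[(a, b)] + 1) / (self.context[a] + self.vocab_size))
            for a, b in pairs
        )
        return logp / len(pairs)


def is_definition(sentence, model, lower, upper):
    """Combine hard and soft matching with two thresholds.

    Hard-matched sentences are kept unless their soft score falls below
    `lower`; other sentences are recalled only if they reach `upper`.
    """
    hard_hit = any(p.search(sentence) for p in HARD_PATTERNS)
    soft = model.score(sentence)
    return soft >= lower if hard_hit else soft >= upper
```

With a model built as `BigramSoftPattern(TRAINING)`, a call such as `is_definition("a parser is a program that reads tokens", model, lower=-2.95, upper=-2.75)` applies the lower threshold because the sentence hits a hard pattern. The threshold values are illustrative only; in practice they would be tuned on held-out data.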
