Journal: 模式识别与人工智能 (Pattern Recognition and Artificial Intelligence)

Structured Information Extraction Based on Pattern Matching

Abstract

Information extracted from semi-structured text is typically coarse-grained, which prevents effective semantic analysis of the extraction results. To address this problem, a domain-oriented secondary extraction method for structured information based on pattern matching is proposed. The method targets semi-structured text presented as Web documents. It performs domain recognition on the coarse-grained extraction results and loads the lexicon of the recognized domain. Each role in a pattern is then mapped to a word in the segmented word sequence according to the part of speech required by that role, so that structured information is extracted from the word sequence and can support accurate semantic analysis. Experiments show that the proposed method yields more accurate extraction results.
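The pipeline described in the abstract (domain recognition, lexicon loading, then mapping pattern roles to words by part of speech) can be illustrated with a minimal sketch. The abstract does not specify the lexicon format, the pattern format, or the segmentation algorithm, so everything below is an assumption made for illustration: the `LEXICONS` and `PATTERNS` tables, the function names, the tag letters (`n`, `ns`, `m`, `q`, `x`), the overlap-count domain scoring, and the use of whitespace splitting as a stand-in for Chinese word segmentation.

```python
from typing import Optional

# Hypothetical domain lexicons: word -> part-of-speech tag.
# n = noun, ns = place name, q = measure word (assumed tag letters).
LEXICONS = {
    "real_estate": {"apartment": "n", "rent": "n", "yuan": "q", "Haidian": "ns"},
    "recruitment": {"engineer": "n", "salary": "n", "yuan": "q", "Beijing": "ns"},
}

# Hypothetical patterns: an ordered list of (role, required POS) per domain.
PATTERNS = {
    "real_estate": [("location", "ns"), ("object", "n"), ("price", "m"), ("unit", "q")],
    "recruitment": [("location", "ns"), ("position", "n"), ("pay", "m"), ("unit", "q")],
}


def recognize_domain(tokens: list[str]) -> str:
    """Pick the domain whose lexicon covers the most tokens (assumed heuristic)."""
    scores = {d: sum(t in lex for t in tokens) for d, lex in LEXICONS.items()}
    return max(scores, key=scores.get)


def tag(tokens: list[str], lexicon: dict[str, str]) -> list[tuple[str, str]]:
    """Tag tokens from the loaded lexicon; digits become 'm', unknown words 'x'."""
    tagged = []
    for t in tokens:
        if t in lexicon:
            pos = lexicon[t]
        elif t.isdigit():
            pos = "m"
        else:
            pos = "x"
        tagged.append((t, pos))
    return tagged


def match_pattern(tagged, pattern) -> Optional[dict[str, str]]:
    """Map each role, in order, to the next word carrying the role's POS."""
    result = {}
    i = 0
    for role, pos in pattern:
        # Skip words whose part of speech does not fit the current role.
        while i < len(tagged) and tagged[i][1] != pos:
            i += 1
        if i == len(tagged):
            return None  # the pattern cannot be completed on this sequence
        result[role] = tagged[i][0]
        i += 1
    return result


def extract(text: str):
    """Run the three steps on one coarse-grained extraction result."""
    tokens = text.split()  # stand-in for real word segmentation
    domain = recognize_domain(tokens)
    tagged = tag(tokens, LEXICONS[domain])
    return domain, match_pattern(tagged, PATTERNS[domain])


if __name__ == "__main__":
    print(extract("Haidian apartment rent 3500 yuan"))
```

In this sketch a role simply takes the first remaining word with a matching tag, so a pattern either binds all of its roles in order or fails as a whole. A real system would need a proper segmenter and a policy for choosing among several candidate patterns, neither of which the abstract describes.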
