Journal: 《计算机与现代化》 (Computer and Modernization)

Research on Named Entity Recognition Methods for Specific Domains

Abstract

In domain-specific named entity recognition (NER), different domains have given rise to a variety of recognition methods. Text in each domain has its own distinctive features, so methods developed for existing domains adapt poorly to a new one. To address this problem, this paper proposes a method that combines conditional random fields (CRF), semi-supervised learning, and active learning into a unified technical framework intended to fit NER across specific domains. The method first selects basic, general-purpose features of the domain text to build a feature set and trains a CRF to perform an initial recognition of named entities in that domain. It then actively selects samples whose confidence falls below a chosen threshold for manual annotation and iteratively expands the training set to improve recognition performance. To validate the method, experiments were carried out on text from the rail transit domain; the results show that the method is effective and achieves good recognition performance in that domain.
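
The threshold-based selection step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the function names (`sequence_confidence`, `active_learning_loop`), the callbacks `train`, `predict`, and `annotate`, the default threshold of 0.8, and the choice of the minimum token marginal as the sentence-level confidence are all assumptions made here. The abstract does not say how confidence is computed or how the semi-supervised step works, so the latter is not shown.

```python
from typing import Callable, List, Sequence, Tuple

Sentence = List[str]
Labels = List[str]
Example = Tuple[Sentence, Labels]


def sequence_confidence(token_marginals: Sequence[float]) -> float:
    """Score a sentence by its least certain token (an assumed choice)."""
    return min(token_marginals) if token_marginals else 0.0


def active_learning_loop(
    labeled: List[Example],
    unlabeled: List[Sentence],
    train: Callable[[List[Example]], object],
    predict: Callable[[object, Sentence], Tuple[Labels, List[float]]],
    annotate: Callable[[Sentence], Labels],
    threshold: float = 0.8,   # hypothetical default, not from the paper
    max_rounds: int = 5,      # hypothetical default, not from the paper
):
    """Iteratively grow the training set with manually labeled low-confidence samples.

    `train` stands in for CRF training, `predict` returns predicted labels plus
    one marginal probability per token, and `annotate` stands in for the human
    annotator.
    """
    labeled = list(labeled)
    pool = list(unlabeled)
    model = train(labeled)
    for _ in range(max_rounds):
        uncertain, remaining = [], []
        for sent in pool:
            _, marginals = predict(model, sent)
            if sequence_confidence(marginals) < threshold:
                uncertain.append(sent)   # send to manual annotation
            else:
                remaining.append(sent)   # model is already confident
        if not uncertain:
            break                        # nothing left below the threshold
        labeled.extend((s, annotate(s)) for s in uncertain)
        pool = remaining
        model = train(labeled)           # retrain on the expanded set
    return model, labeled, pool
```

With a CRF library that exposes per-token marginals, `predict` would wrap that call; any model reporting per-token probabilities fits the same loop.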
