Conference: IEEE International Conference on Systems, Man and Cybernetics

CRF-based Active Learning for Chinese Named Entity Recognition

Abstract

Conditional Random Fields (CRFs) have been used for many sequence labeling tasks and have achieved excellent results. However, as supervised models, they depend strongly on large amounts of training data. Active learning takes a different approach: rather than relying on a large amount of randomly sampled data, the learner actively participates in choosing the most useful training examples. Based on different query strategies, active learning can be combined with other machine learning methods to reduce annotation cost while maintaining accuracy. This paper proposes a new active learning strategy based on Information Density (ID), integrated with CRFs, for Chinese Named Entity Recognition (NER). On the SIGHAN Bakeoff 2006 MSRA NER corpus, an F1 score of 77.2% is achieved using only 10,000 labeled training sentences chosen by the proposed active learning strategy.
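
The abstract does not spell out how the Information Density score is computed, so the following is only a minimal sketch of the generic information-density idea: weight each unlabeled sentence's model uncertainty by its average similarity to the rest of the unlabeled pool, so that uncertain but atypical sentences are not preferred over uncertain and representative ones. Everything named here is an assumption made for illustration: the `uncertainty` values stand in for whatever a trained CRF would report (for example, one minus the probability of its best label sequence), similarity is cosine over character counts, and `beta` and `select_batch` are hypothetical names, not the paper's.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[key] for key, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def information_density(uncertainty, sentences, beta=1.0):
    """Score = uncertainty * (mean similarity to the whole pool) ** beta."""
    # Assumption: each sentence is represented by its character counts.
    vectors = [Counter(sentence) for sentence in sentences]
    scores = []
    for i, vec in enumerate(vectors):
        density = sum(cosine(vec, other) for other in vectors) / len(vectors)
        scores.append(uncertainty[i] * density ** beta)
    return scores


def select_batch(uncertainty, sentences, k, beta=1.0):
    """Return indices of the k highest-scoring sentences to send for annotation."""
    scores = information_density(uncertainty, sentences, beta)
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


# Hypothetical pool: two similar sentences and one outlier with higher uncertainty.
pool = ["北京大学在北京", "北京大学位于北京", "zzz"]
uncertainty = [0.5, 0.5, 0.6]  # placeholder values, not CRF output
chosen = select_batch(uncertainty, pool, k=2)
```

With `beta=0` the density term drops out and the ranking reduces to plain uncertainty sampling; larger `beta` values give more weight to how representative a sentence is of the pool.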
