
Domain Term Extraction Based on Conditional Random Fields Combined with Active Learning Strategy



Abstract

Chinese domain term extraction is an important task in Chinese information processing, with applications in lexicography, ontology construction, and related areas. This paper presents a Chinese automobile term extraction system based on CRFs (Conditional Random Fields) and active learning. The CRF model uses seven kinds of features; under 5-fold cross-validation it achieves a precision of 84.61%, a recall of 80.50%, and an F-score of 82.50%. Because supervised machine learning requires a large labeled corpus and manual labeling is expensive, we integrate active learning into the CRF model. The active learning method uses an uncertainty-based sampling strategy. Experimental results show that when the training corpus amounts to 80% of the whole unlabeled corpus, the F-score is 82.49%, almost the same as the 82.50% obtained with 100%.
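The two quantitative ideas in the abstract can be illustrated with a minimal sketch. The abstract does not say which uncertainty measure the authors use, so the selection function below assumes least-confidence sampling (pick the sentences whose best CRF labeling has the lowest probability). The sentence ids, confidence values, and function names are hypothetical. The F-score helper assumes the balanced F1 formula, which reproduces the reported 82.50% from the reported precision and recall.

```python
from typing import Dict, List


def f_score(precision: float, recall: float) -> float:
    """Balanced F-score: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def select_most_uncertain(confidence: Dict[str, float], batch_size: int) -> List[str]:
    """Least-confidence sampling (an assumed strategy, not confirmed by the paper).

    `confidence` maps a sentence id to the model's probability for its
    best label sequence; the lowest-probability sentences are returned
    so that they can be labeled manually and added to the training set.
    """
    ranked = sorted(confidence, key=lambda sid: confidence[sid])
    return ranked[:batch_size]


# Hypothetical confidences for four unlabeled sentences.
pool_confidence = {"s1": 0.97, "s2": 0.41, "s3": 0.88, "s4": 0.53}
to_label = select_most_uncertain(pool_confidence, batch_size=2)
print(to_label)                          # ['s2', 's4']

# Precision and recall as reported in the abstract (in percent).
print(round(f_score(84.61, 80.50), 2))   # 82.5
```

In an active learning loop, the selected sentences would be annotated, moved from the unlabeled pool into the training corpus, and the CRF retrained before the next round of selection.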

Bibliographic record

  • Source
    Journal of information and computational science | 2012, Issue 7 | pp. 1931-1940 | 10 pages
  • Author affiliations

    School of Management Science and Engineering, Dalian University of Technology 116023 Dalian, China; School of Computer Science and Technology, Dalian University of Technology 116023 Dalian, China;

    School of Management Science and Engineering, Dalian University of Technology 116023 Dalian, China;

    School of Computer Science and Technology, Dalian University of Technology 116023 Dalian, China;

    School of Computer Science and Technology, Dalian University of Technology 116023 Dalian, China;

  • Indexing information
  • Original format: PDF
  • Text language: eng
  • Chinese Library Classification
  • Keywords

    term extraction; automobile field; CRFs; active learning;

  • Date indexed: 2022-08-18 02:13:10
