Improving the Scalability of Semi-Markov Conditional Random Fields for Named Entity Recognition

Abstract

This paper presents techniques for applying semi-Markov conditional random fields (semi-CRFs) to Named Entity Recognition (NER) tasks at a tractable computational cost. Our framework can handle NER tasks with long named entities and many labels, both of which increase the computational cost. To reduce this cost, we propose two techniques: the first is the use of feature forests, which enables us to pack feature-equivalent states, and the second is a filtering process that significantly reduces the number of candidate states. This framework allows us to use a rich set of features extracted from the chunk-based representation, which can capture informative characteristics of entities. We also introduce a simple trick for transferring information about distant entities by embedding label information into non-entity labels. Experimental results show that our model achieves an F-score of 71.48% on the JNLPBA 2004 shared task without using any external resources or post-processing techniques.
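
To make the role of the filtering step concrete, the following is a minimal sketch of segment-level (semi-Markov) Viterbi decoding in which a filter prunes candidate states before they are scored. It is not the authors' implementation: feature forests are not modeled, and the names `semi_markov_decode`, `toy_score`, and `keep_short_o` are hypothetical, as are the hand-set scores that stand in for a trained model.

```python
from typing import Callable, Dict, List, Optional, Tuple

Segment = Tuple[int, int, str]  # (start, end_exclusive, label)
ScoreFn = Callable[[int, int, str, Optional[str]], float]
KeepFn = Callable[[int, int, str], bool]


def semi_markov_decode(
    n: int,
    labels: List[str],
    max_len: int,
    score: ScoreFn,
    keep: KeepFn = lambda s, e, y: True,
) -> Tuple[float, List[Segment]]:
    """Best segmentation of n tokens into labeled segments of length <= max_len.

    `score(s, e, y, prev)` is an assumed scoring function for segment [s, e)
    with label y following label prev (None at the sentence start).
    `keep(s, e, y)` is the filter: candidates it rejects are never scored.
    """
    if n == 0:
        return 0.0, []
    neg_inf = float("-inf")
    # best[e][y]: best score of a segmentation of tokens [0, e) ending in label y
    best: List[Dict[str, float]] = [{y: neg_inf for y in labels} for _ in range(n + 1)]
    back: List[Dict[str, Optional[Tuple[int, Optional[str]]]]] = [
        {y: None for y in labels} for _ in range(n + 1)
    ]
    for e in range(1, n + 1):
        for s in range(max(0, e - max_len), e):
            for y in labels:
                if not keep(s, e, y):
                    continue  # filtered candidate state: skipped entirely
                if s == 0:
                    prevs = [(None, 0.0)]
                else:
                    prevs = [(p, best[s][p]) for p in labels if best[s][p] > neg_inf]
                for prev, base in prevs:
                    total = base + score(s, e, y, prev)
                    if total > best[e][y]:
                        best[e][y] = total
                        back[e][y] = (s, prev)
    last = max(labels, key=lambda y: best[n][y])
    if best[n][last] == neg_inf:
        raise ValueError("filter removed every complete segmentation")
    # Follow back-pointers from the end of the sentence to its start.
    segments: List[Segment] = []
    e, y = n, last
    while e > 0:
        s, prev = back[e][y]
        segments.append((s, e, y))
        e, y = s, prev
    segments.reverse()
    return best[n][last], segments


def toy_score(s: int, e: int, y: str, prev: Optional[str]) -> float:
    """Hand-set scores for a 3-token toy sentence (illustration only)."""
    if (s, e, y) == (0, 2, "protein"):
        return 5.0
    if y == "O" and e - s == 1:
        return 1.0
    return -1.0


def keep_short_o(s: int, e: int, y: str) -> bool:
    """Example filter: allow non-entity 'O' segments of length 1 only."""
    return y != "O" or e - s == 1
```

Calling `semi_markov_decode(3, ["protein", "O"], 3, toy_score, keep_short_o)` decodes the toy sentence. Without a filter, the inner loops visit on the order of n × max_len × |labels|² combinations, which is why long entities and large label sets are costly and why discarding candidates before scoring helps.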
