Journal: Pattern Recognition Letters

A composite kernel for named entity recognition

Abstract

In this paper, we propose a novel kernel function for support vector machines (SVMs) that can be used for sequence labeling tasks such as named entity recognition (NER). Machine learning methods such as support vector machines, maximum entropy models, hidden Markov models, and conditional random fields are the most widely used approaches for implementing NER systems, and the features they use are mostly string-based. The proposed kernel is built on a novel distance function between these string-based features. In tasks like NER, both the similarity between contexts and the semantic similarity between words play an important role, and the goal of the kernel is to capture this contextual and semantic information. The proposed distance function uses statistics derived primarily from the training data and from hierarchical clustering information. The kernel is applied to Hindi and biomedical NER tasks, and the results are promising.
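The abstract does not give the distance function itself, so the following is only an illustrative sketch of the general idea: define a distance between string-valued features from training-data statistics, then turn it into a kernel. As an assumption, two feature values are compared by the total variation distance between their tag distributions estimated from the training set, and the kernel is a Laplacian-style exp(-gamma * distance). The names `tag_distributions`, `feature_distance`, `kernel`, and `gram_matrix`, the toy data, and the tag set are all hypothetical; the hierarchical clustering component mentioned in the abstract is not modeled here.

```python
import math
from collections import Counter, defaultdict

def tag_distributions(examples, tags):
    """Estimate P(tag | feature value) for each feature position from training data."""
    counts = defaultdict(Counter)  # (position, value) -> tag counts
    prior = Counter()              # overall tag counts
    for features, tag in examples:
        prior[tag] += 1
        for pos, value in enumerate(features):
            counts[(pos, value)][tag] += 1
    total = sum(prior.values())
    prior_dist = [prior[t] / total for t in tags]
    dists = {}
    for key, c in counts.items():
        n = sum(c.values())
        dists[key] = [c[t] / n for t in tags]
    return dists, prior_dist

def feature_distance(dists, prior_dist, pos, a, b):
    """Total variation distance between the tag distributions of two string values.

    Values never seen in training fall back to the overall tag prior (an assumption).
    """
    p = dists.get((pos, a), prior_dist)
    q = dists.get((pos, b), prior_dist)
    return 0.5 * sum(abs(x - y) for x, y in zip(p, q))

def kernel(dists, prior_dist, x, y, gamma=1.0):
    """Laplacian-style kernel over the summed per-position feature distances."""
    d = sum(feature_distance(dists, prior_dist, i, a, b)
            for i, (a, b) in enumerate(zip(x, y)))
    return math.exp(-gamma * d)

def gram_matrix(dists, prior_dist, xs, ys, gamma=1.0):
    """Kernel values between every pair of examples, as a list of rows."""
    return [[kernel(dists, prior_dist, x, y, gamma) for y in ys] for x in xs]

# Toy training set (hypothetical): features are (previous word, current word).
TAGS = ["LOC", "PER", "O"]
train = [
    (("in", "Delhi"), "LOC"),
    (("in", "Mumbai"), "LOC"),
    (("Dr.", "Singh"), "PER"),
    (("Dr.", "Rao"), "PER"),
    (("the", "river"), "O"),
]
dists, prior_dist = tag_distributions(train, TAGS)

k_similar = kernel(dists, prior_dist, ("in", "Delhi"), ("in", "Mumbai"))
k_different = kernel(dists, prior_dist, ("in", "Delhi"), ("Dr.", "Singh"))
k_unseen = kernel(dists, prior_dist, ("in", "Delhi"), ("in", "Pune"))
gram = gram_matrix(dists, prior_dist, [x for x, _ in train], [x for x, _ in train])
```

In this toy setup, "Delhi" and "Mumbai" are different strings but behave identically in the training data, so their kernel value is high, which is one simple way a string-feature distance can reflect contextual similarity. A Gram matrix built this way could be passed to an SVM implementation that accepts precomputed kernels, for example scikit-learn's `SVC(kernel="precomputed")`.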
