International Conference on Discovery Science (DS 2005); October 8-11, 2005; Singapore (SG)

Named Entity Recognition for the Indonesian Language: Combining Contextual, Morphological and Part-of-Speech Features into a Knowledge Engineering Approach


Abstract

We present a novel named entity recognition approach for the Indonesian language. We call the new method InNER, for Indonesian Named Entity Recognition. InNER is based on a set of rules capturing the contextual, morphological, and part-of-speech knowledge needed to recognize named entities in Indonesian texts. The InNER strategy is one of knowledge engineering: the domain- and language-specific rules are designed by expert knowledge engineers. Having shown in our previous work that mined association rules can effectively recognize named entities and outperform maximum entropy methods, we needed to evaluate how much the rule-based approach can improve when expert-crafted knowledge is used. The results are conclusive: the InNER method yields recall and precision of up to 63.43% and 71.84%, respectively. Thus, it significantly outperforms not only maximum entropy methods but also the association-rule-based method we had previously designed.
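To make the idea of combining contextual, morphological, and part-of-speech cues in hand-written rules more concrete, here is a minimal Python sketch. It is not the InNER rule set, which the abstract does not list. The cue lists (`TITLES`, `ORG_PREFIXES`), the toy lookup `TOY_POS`, and the function `tag_entities` are all hypothetical names introduced for illustration, and the three rules are assumptions chosen only because they are easy to read.

```python
# Hypothetical toy rules for illustration; NOT the InNER rules from the paper.
TITLES = {"Pak", "Bu", "Ibu", "Presiden"}   # contextual cue: titles that precede a person name
ORG_PREFIXES = {"PT"}                        # contextual cue: company designator
TOY_POS = {"di": "PREP", "ke": "PREP", "dari": "PREP"}  # tiny stand-in for a real POS tagger


def is_capitalized(token):
    """Morphological feature: the token starts with an uppercase letter."""
    return token[:1].isupper()


def tag_entities(tokens):
    """Return (token, label) pairs; label is None when no rule fires."""
    labels = []
    for i, token in enumerate(tokens):
        prev = tokens[i - 1] if i > 0 else None
        label = None
        # Every rule here needs a capitalized token and a preceding token.
        if prev is not None and is_capitalized(token):
            if prev in TITLES:
                label = "PERSON"
            elif prev in ORG_PREFIXES:
                label = "ORGANIZATION"
            elif TOY_POS.get(prev.lower()) == "PREP":
                # Part-of-speech cue: a capitalized word after a preposition.
                label = "LOCATION"
        labels.append((token, label))
    return labels


sentence = "Pak Budi bekerja di Jakarta".split()
result = tag_entities(sentence)
print(result)
```

In this sketch each rule fires only when a morphological cue (capitalization) and a contextual or part-of-speech cue on the preceding token agree. A real knowledge-engineered system would presumably need far more rules, multi-token entities, and a proper part-of-speech tagger.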
