International Workshop on Computational Processing of the Portuguese Language (PROPOR 2006)

SIEMêS - A Named-Entity Recognizer for Portuguese Relying on Similarity Rules

Abstract

In this paper we describe SIEMêS, a named-entity recognition system for Portuguese whose classification procedure relies on a set of similarity rules. These rules try to obtain soft matches between candidate entities found in text and instances contained in a wide-scope gazetteer; by exploiting lexical similarities, they avoid the need to code large sets of rules. Using this matching procedure, SIEMêS generates a set of classification hypotheses based solely on internal evidence, which may be disambiguated in a later step by relatively simple rules based on contextual clues. We explain the SIEMêS architecture and its named-entity identification and classification procedure. We also briefly discuss the results of the participation of SIEMêS in HAREM, the named-entity evaluation contest for Portuguese, and describe future work.
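
The soft-matching idea described in the abstract can be illustrated with a minimal Python sketch. This is not the SIEMêS implementation: the toy gazetteer, the token-level Jaccard similarity, the 0.3 threshold, and all function names are assumptions made for illustration, since the abstract does not specify the actual similarity rules.

```python
from collections import defaultdict

# Hypothetical toy gazetteer (entry -> category); not taken from the paper.
GAZETTEER = {
    "Universidade do Porto": "ORGANIZATION",
    "Universidade de Lisboa": "ORGANIZATION",
    "Porto": "LOCATION",
    "Rio de Janeiro": "LOCATION",
}


def tokens(text):
    """Lower-cased set of whitespace-separated tokens."""
    return set(text.lower().split())


def jaccard(a, b):
    """Token-level Jaccard similarity (an assumed stand-in for a similarity rule)."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def hypotheses(candidate, gazetteer=GAZETTEER, threshold=0.3):
    """Return (category, best score) pairs using internal evidence only.

    Every gazetteer entry whose similarity to the candidate reaches the
    threshold contributes a hypothesis for its category; the best score
    per category is kept. Results are sorted by descending score.
    """
    scores = defaultdict(float)
    for entry, category in gazetteer.items():
        score = jaccard(candidate, entry)
        if score >= threshold:
            scores[category] = max(scores[category], score)
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    # Not in the gazetteer, but lexically close to "Universidade do Porto".
    print(hypotheses("Universidade do Minho"))
    # Ambiguous: exact LOCATION match plus a weaker ORGANIZATION match.
    print(hypotheses("Porto"))
```

In this sketch, "Universidade do Minho" receives an ORGANIZATION hypothesis even though it is absent from the gazetteer, while "Porto" yields both LOCATION and ORGANIZATION hypotheses. The second case is the kind of ambiguity that the abstract says is left to a later step using contextual clues.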