HFST-SweNER - A New NER Resource for Swedish

Abstract

Named entity recognition (NER) is a knowledge-intensive information extraction task that recognizes textual mentions of entities belonging to a predefined set of categories, such as locations, organizations and time expressions. NER is a challenging yet essential preprocessing technology for many natural language processing applications, and it is particularly crucial for language understanding. NER has been actively explored in academia and industry, especially in recent years with the advent of social media data. This paper describes the conversion, modeling and adaptation of a Swedish NER system from a hybrid environment, with integrated functionality from various processing components, to the Helsinki Finite-State Transducer Technology (HFST) platform. The new HFST-based NER system (HFST-SweNER) is a full-fledged open source implementation that supports a variety of generic named entity types and consists of multiple, reusable resource layers, e.g., various n-gram-based named entity lists (gazetteers).