Conference: SYNAT Workshop

Liner2 - A Customizable Framework for Proper Names Recognition for Polish

Abstract

In this paper we present Liner2, a customizable, open-source framework for proper names recognition. The framework consists of several universal methods for sequence chunking, which include dictionary look-up, pattern matching and statistical processing. The statistical processing is performed using Conditional Random Fields (CRF) and a rich set of features that includes morphological, lexical and semantic information. We present an application of the framework to the task of recognizing proper names in Polish texts (5 common categories of proper names: first names, surnames, city names, road names and country names). The Liner2 framework was also used to train an extended model that recognizes 56 categories of proper names; this model was used to bootstrap the manual annotation of the KPWr corpus. We also present the CRF-based model integrated with a heterogeneous named entity similarity function. We show that adding the similarity function to the best configuration improved the final result in cross-domain evaluation. The last section presents NER-WS, a web service for proper names recognition in Polish texts that uses the Liner2 framework and the model for 56 categories of proper names.
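To make the idea of combining chunking methods concrete, the following is a minimal Python sketch of two of the three method families named in the abstract: dictionary look-up and pattern matching. It is not Liner2's code or API. The gazetteer entries, the "ul." road-name rule, the function names and the overlap policy are all assumptions made for illustration, and the CRF-based statistical chunker is left out entirely.

```python
import re
from typing import Dict, List, Tuple

# A chunk is (start_token, end_token_exclusive, category).
Span = Tuple[int, int, str]

# Hypothetical toy gazetteer; a real dictionary would be far larger.
GAZETTEER: Dict[Tuple[str, ...], str] = {
    ("Jan",): "first_name",
    ("Kowalski",): "surname",
    ("Nowy", "Targ"): "city_name",
    ("Polska",): "country_name",
}

# Hypothetical rule: the street abbreviation "ul." followed by a
# capitalised token marks that token as a road name.
ROAD_PREFIX = re.compile(r"^ul\.$")


def dictionary_chunker(tokens: List[str]) -> List[Span]:
    """Greedy longest-match look-up of token n-grams in the gazetteer."""
    spans: List[Span] = []
    max_len = max(len(key) for key in GAZETTEER)
    i = 0
    while i < len(tokens):
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            category = GAZETTEER.get(tuple(tokens[i:i + n]))
            if category is not None:
                spans.append((i, i + n, category))
                i += n
                break
        else:
            i += 1  # no entry starts at this token
    return spans


def pattern_chunker(tokens: List[str]) -> List[Span]:
    """Apply the single hand-written road-name rule."""
    spans: List[Span] = []
    for i in range(len(tokens) - 1):
        if ROAD_PREFIX.match(tokens[i]) and tokens[i + 1][:1].isupper():
            spans.append((i + 1, i + 2, "road_name"))
    return spans


def combine(*span_lists: List[Span]) -> List[Span]:
    """Merge chunker outputs; on overlap, earlier chunkers take priority."""
    accepted: List[Span] = []
    for spans in span_lists:
        for span in spans:
            if all(span[1] <= a[0] or span[0] >= a[1] for a in accepted):
                accepted.append(span)
    return sorted(accepted)


tokens = "Jan Kowalski mieszka przy ul. Polna , Nowy Targ , Polska".split()
for start, end, category in combine(dictionary_chunker(tokens),
                                    pattern_chunker(tokens)):
    print(category, " ".join(tokens[start:end]))
```

The sketch matches surface forms only. Polish is highly inflected, so a practical dictionary look-up would likely need lemmatised input or inflected entries, which is one reason a statistical model with morphological features is useful alongside rule-based chunkers.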
