Conference: Language and Technology Conference

Building a Morphosyntactic Lexicon and a Pre-syntactic Processing Chain for Polish

Abstract

This paper introduces a new set of tools and resources for Polish that covers all the steps required to transform raw, unrestricted text into reasonable input for a parser. It includes (1) a large-coverage morphological lexicon, developed using the IPI PAN corpus together with a lexical acquisition technique, and (2) multiple tools for spelling correction, segmentation, tokenization and named entity recognition. The processing chain can also handle the XCES format as both input and output, which makes it possible to improve XCES corpora such as the IPI PAN corpus itself. This allows us to give a brief qualitative evaluation of the lexicon and of the processing chain.