International Conference on Computational Linguistics; COLING-96

Efficient Integrated Tagging of Word Constructs

Abstract

We describe a robust text-handling component, which can deal with free text in a wide range of formats and can successfully identify a wide range of phenomena, including chemical formulae, dates, numbers and proper nouns. The set of regular expressions used to capture numbers in written form ("sechsundzwanzig") in German is given as an example. Proper noun "candidates" are identified by means of regular expressions and are then accepted or rejected on the basis of run-time interaction with the user. This tagging component is integrated in a large-scale grammar development environment, and provides direct input to the grammatical analysis component of the system by means of "lift" rules which convert tagged text into partial linguistic structures.
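The abstract does not reproduce the paper's regular expressions, so the following is only a minimal sketch of how written German numbers such as "sechsundzwanzig" might be captured with a regular expression. It assumes a range of 1 to 99; the names `NUMBER_RE` and `tag_numbers` are hypothetical and are not taken from the paper.

```python
import re

# Illustrative word lists (assumption: only 1..99 is covered).
UNITS = {"ein": 1, "zwei": 2, "drei": 3, "vier": 4, "fünf": 5,
         "sechs": 6, "sieben": 7, "acht": 8, "neun": 9}
TEENS = {"zehn": 10, "elf": 11, "zwölf": 12, "dreizehn": 13,
         "vierzehn": 14, "fünfzehn": 15, "sechzehn": 16,
         "siebzehn": 17, "achtzehn": 18, "neunzehn": 19}
TENS = {"zwanzig": 20, "dreißig": 30, "vierzig": 40, "fünfzig": 50,
        "sechzig": 60, "siebzig": 70, "achtzig": 80, "neunzig": 90}


def alt(words):
    """Build a regex alternation, longest word first."""
    return "|".join(sorted(words, key=len, reverse=True))


# German compounds put the unit first: "sechs" + "und" + "zwanzig" = 26.
NUMBER_RE = re.compile(
    rf"\b(?:(?P<unit>{alt(UNITS)})und(?P<tens>{alt(TENS)})"
    rf"|(?P<teen>{alt(TEENS)})"
    rf"|(?P<round>{alt(TENS)})"
    rf"|(?P<single>eins|{alt(UNITS)}))\b",
    re.IGNORECASE,
)


def tag_numbers(text):
    """Return (surface form, integer value) pairs for written numbers."""
    tagged = []
    for m in NUMBER_RE.finditer(text):
        # Keep only the named groups that took part in this match.
        g = {k: v.lower() for k, v in m.groupdict().items() if v}
        if "unit" in g:
            value = UNITS[g["unit"]] + TENS[g["tens"]]
        elif "teen" in g:
            value = TEENS[g["teen"]]
        elif "round" in g:
            value = TENS[g["round"]]
        else:
            value = 1 if g["single"] == "eins" else UNITS[g["single"]]
        tagged.append((m.group(0), value))
    return tagged


print(tag_numbers("Er ist sechsundzwanzig Jahre alt und hat drei Kinder."))
```

Note that this sketch would also tag the indefinite article "ein" as the number 1; resolving that kind of ambiguity is left to later processing and is not addressed here.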