APPARATUS AND METHOD FOR RECOGNIZING THE NAMED ENTITY USING BACKOFF N-GRAM FEATURES

Abstract

An apparatus and a method for recognizing named entities using back-off n-gram features are provided to address problems caused by data sparseness and to ensure reliable recognition of named entities for unregistered or low-frequency words. A morphologically analyzed training corpus (41) is produced by passing the corpus through a lexical analyzer (20) and is input to a feature information extractor (33). The feature information extractor also processes a morphologically analyzed input sentence (42), which is received from outside through a sentence input unit and the lexical analyzer. From the morphologically analyzed training corpus, the feature information extractor extracts features based on back-off n-gram features and supplies them as training features (43) to a model trainer (34). From the morphologically analyzed input sentence, it extracts features and supplies them as test features (44) to a candidate named entity extractor.
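
The abstract does not spell out how a back-off n-gram feature is built, so the following is only a minimal sketch under stated assumptions. It assumes that each morphologically analyzed token is a (morpheme, part-of-speech tag) pair and that "backing off" means replacing the surface forms of the oldest tokens in the n-gram window with their part-of-speech tags, one token at a time. The function name `backoff_ngram_features`, the `bo{k}:` feature prefix, the `<PAD>` symbol, and the sample tags are all hypothetical and do not come from the patent.

```python
from typing import List, Tuple

# Assumed token shape: (surface morpheme, POS tag) from a morphological analyzer.
Token = Tuple[str, str]

PAD: Token = ("<PAD>", "<PAD>")  # hypothetical padding token for sentence start


def backoff_ngram_features(tokens: List[Token], i: int, n: int = 2) -> List[str]:
    """Return n-gram features for position i, ordered from most to least specific.

    Level k replaces the surface form of the k oldest tokens in the window
    with their POS tags, so level 0 is fully lexical and level n is POS-only.
    """
    # Window holds the n-1 preceding tokens plus the current one.
    window = [tokens[j] if j >= 0 else PAD for j in range(i - n + 1, i + 1)]
    features = []
    for k in range(n + 1):
        parts = [
            pos if idx < k else surface
            for idx, (surface, pos) in enumerate(window)
        ]
        features.append(f"bo{k}:" + "|".join(parts))
    return features


if __name__ == "__main__":
    seen = [("Seoul", "NNP"), ("city", "NN")]
    unseen = [("Busan", "NNP"), ("city", "NN")]
    print(backoff_ngram_features(seen, 1))
    print(backoff_ngram_features(unseen, 1))
```

Under these assumptions, a low-frequency word such as the hypothetical "Busan" produces a fully lexical feature that never occurred in training, but it still shares the backed-off features `bo1:NNP|city` and `bo2:NNP|NN` with the seen example, which is one way the data sparseness problem mentioned in the abstract could be reduced.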
