Method for automatic correction of errors in annotated corpus using kernel Ripple-Down Rules

Abstract

The present invention relates to a method for automatically correcting errors in a training corpus used for machine learning in natural language processing. With existing corpus error correction methods, a user has to build the training corpus by hand in order to generate recognition and classification models; as a result, error patterns are irregular and correction rules are hard to write. To address these problems, correction rules that reflect the properties of documents tagged in a correct corpus and an erroneous corpus are generated automatically using Ripple-Down Rules (RDR). Errors in a training corpus for machine learning are then detected so that a morphological analysis corpus and a named-entity corpus can be corrected, minimizing errors during mass production of corpora. In addition, because the RDR system operates on a kernel, properties of Korean corpora can be applied through morpheme-level operations, and the method can be applied to various tagged corpora by changing only the kernel.
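The abstract does not give the rule format or the kernel interface, so the following is only a minimal sketch of the general idea, assuming a single-classification RDR tree with "except" and "else" branches and a pluggable kernel that turns a token in context into features. All names here (`simple_kernel`, `Rule`, `evaluate`, `add_rule`, `train`, `correct`) and the English tags in the usage example are hypothetical illustrations, not the patent's actual implementation.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

Token = Tuple[str, str]                      # (surface form, tag)
Features = Dict[str, str]
Kernel = Callable[[List[Token], int], Features]


def simple_kernel(sentence: List[Token], i: int) -> Features:
    """Hypothetical kernel: the word, its current tag, and the previous tag."""
    word, tag = sentence[i]
    prev_tag = sentence[i - 1][1] if i > 0 else "<s>"
    return {"word": word, "tag": tag, "prev_tag": prev_tag}


@dataclass
class Rule:
    condition: Features                      # every key/value pair must match
    conclusion: Optional[str]                # corrected tag; None = no correction
    except_rule: Optional["Rule"] = None     # followed when this rule fires
    else_rule: Optional["Rule"] = None       # followed when it does not fire

    def matches(self, feats: Features) -> bool:
        return all(feats.get(k) == v for k, v in self.condition.items())


def evaluate(root: Rule, feats: Features) -> Optional[str]:
    """Return the conclusion of the last rule that fired on the path."""
    result, node = None, root
    while node is not None:
        if node.matches(feats):
            result = node.conclusion
            node = node.except_rule
        else:
            node = node.else_rule
    return result


def add_rule(root: Rule, feats: Features, conclusion: str) -> None:
    """Attach a new rule at the point where evaluation of this case ended."""
    node = root
    while True:
        branch = "except_rule" if node.matches(feats) else "else_rule"
        child = getattr(node, branch)
        if child is None:
            # Assumption: the full feature set is used as the rule condition.
            setattr(node, branch, Rule(dict(feats), conclusion))
            return
        node = child


def train(root: Rule, pairs: List[Tuple[List[Token], List[Token]]],
          kernel: Kernel) -> None:
    """Grow the tree from aligned (erroneous, correct) sentence pairs."""
    for wrong, gold in pairs:
        for i, (_, gold_tag) in enumerate(gold):
            feats = kernel(wrong, i)
            predicted = evaluate(root, feats) or wrong[i][1]
            if predicted != gold_tag:
                add_rule(root, feats, gold_tag)


def correct(root: Rule, sentence: List[Token], kernel: Kernel) -> List[Token]:
    """Apply the learned rules; tokens with no firing rule keep their tag."""
    return [(word, evaluate(root, kernel(sentence, i)) or tag)
            for i, (word, tag) in enumerate(sentence)]


if __name__ == "__main__":
    root = Rule({}, None)                    # default rule: change nothing
    wrong = [("the", "DT"), ("run", "VB")]
    gold = [("the", "DT"), ("run", "NN")]
    train(root, [(wrong, gold)], simple_kernel)
    print(correct(root, wrong, simple_kernel))
```

In this sketch, swapping `simple_kernel` for a kernel that extracts morpheme-level features would be the only change needed to target a different tag set, which mirrors the "change only the kernel" claim in the abstract.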