
Keyword extension method and system and classification corpus annotation method and system


Abstract

This invention provides a keyword expansion method and system. The method searches with a predetermined initial keyword, takes the keywords obtained from that search as the basis for the next search, and repeats the search iteratively. When the keyword error between the keywords obtained in the current search and those obtained in the previous search falls below a predetermined threshold, the keywords from the current search are used as the expanded keywords of the initial keyword. This removes the need to build a thesaurus manually, as in the prior art, and the method is described as simple, accurate and efficient. A method and system for automatically annotating a classified corpus are also provided. That method comprises: determining one or more initial core keywords for each class; expanding the initial core keywords to obtain expanded keywords for each class; and searching with the expanded keywords of a class to select a classified corpus and annotate it.
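The iterative expansion and the annotation step described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the patented implementation: the abstract does not say how the search is performed, how keywords are extracted from results, or how the "keyword error" is computed. The sketch assumes a toy in-memory search over whitespace-tokenized documents, top-k term frequency as the keyword extractor, and Jaccard distance between successive keyword sets as the error measure. All names (`keyword_error`, `search`, `extract_keywords`, `expand_keywords`, `annotate_corpus`, `DOCS`) and the default threshold are hypothetical.

```python
from collections import Counter

# Toy corpus (hypothetical data, for illustration only).
DOCS = [
    "football match goal striker",
    "football league goal referee",
    "striker goal penalty football",
    "stock market shares investor",
    "market shares dividend stock",
    "investor stock bond market",
]


def keyword_error(current: set[str], previous: set[str]) -> float:
    """Assumed error measure: Jaccard distance between two keyword sets."""
    union = current | previous
    if not union:
        return 0.0
    return 1.0 - len(current & previous) / len(union)


def search(docs: list[str], keywords: set[str]) -> list[str]:
    """Toy search: return the documents containing at least one keyword."""
    return [d for d in docs if keywords & set(d.split())]


def extract_keywords(hits: list[str], top_k: int = 3) -> set[str]:
    """Toy extraction: the top_k most frequent tokens (ties broken by name)."""
    counts = Counter(tok for d in hits for tok in d.split())
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return {tok for tok, _ in ranked[:top_k]}


def expand_keywords(docs, initial, threshold=0.2, max_iters=20):
    """Search repeatedly, feeding each round's keywords into the next search,
    until two successive rounds differ by less than `threshold`."""
    query = set(initial)
    previous = None
    for _ in range(max_iters):
        current = extract_keywords(search(docs, query))
        if previous is not None and keyword_error(current, previous) < threshold:
            return current
        previous = current
        query = current
    return query  # no convergence within max_iters: return the latest keywords


def annotate_corpus(docs, core_keywords):
    """Expand each class's core keywords, then label every document that the
    expanded keywords retrieve with that class."""
    labelled = []
    for label, initial in core_keywords.items():
        expanded = expand_keywords(docs, initial)
        labelled.extend((doc, label) for doc in search(docs, expanded))
    return labelled


if __name__ == "__main__":
    print(expand_keywords(DOCS, {"football"}))
    for doc, label in annotate_corpus(DOCS, {"sports": {"football"}, "finance": {"stock"}}):
        print(label, "|", doc)
```

The `max_iters` cap is a safeguard added for the sketch so the loop terminates even if successive keyword sets never settle; the abstract itself only states the threshold condition.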
