METHOD FOR DYNAMICALLY GENERATING ADDITIONAL TERMS FOR EACH MEANING OF EVERY NATURAL LANGUAGE EXPRESSION; DICTIONARY MANAGER, DOCUMENTATION GENERATOR, TERM ANNOTATOR, SEARCH SYSTEM, AND DEVICE FOR BUILDING DOCUMENT INFORMATION SYSTEM BASED ON THE METHOD

Abstract

The present invention relates to converting an information system built on natural language expressions into one based on unit expressions of meaning, with corresponding functional changes to the information search system, term dictionary, documentation generator, and term converter. The accuracy of current search systems is very low because natural language represents many meanings with few words: expressions become longer and harder to recall as the number of terms grows, so people reuse a small number of terms. When unit expressions of meaning are introduced, in which one term corresponds to exactly one meaning, the accuracy of a search system can approach 100%. The present invention presents a method for easily generating unit expressions of meaning and a method for efficiently applying them to documents from around the world. Unit expressions of meaning are created by splitting each natural language term into as many terms as it has meanings. Because this is a simple breakdown of terms, anyone can generate them. Applying the generated terms to documents from around the world is a formidable task. For this task, according to the present invention, instead of changing each occurrence of a repeatedly used word one by one, alignment is performed for each word, and selected groups of aligned words are processed simultaneously. Even if a word has been used several hundred billion times in documents throughout the world, there is no need to perform several hundred billion term conversions; if the word has several meanings, the conversion can be performed with just several sorting commands. Although the repetitive use of terms therefore does not impose a large load, term conversion is still not simple, because the number of unit expressions of meaning is itself enormous.
Processing close to 10 billion unit expressions of meaning is a daunting task. One way to overcome this difficulty is to distribute the task equally among many users. The greatest factor contributing to the ambiguity of natural language is the presence of innumerable proper nouns. These encroach on the domains of nouns, adjectives, verbs, and all other parts of speech, causing semantic confusion. Proper nouns are not limited to personal names, but even counting personal names alone, this category contains over 10 billion terms, since the global population exceeds 6 billion. The present invention presents a configuration in which this prodigious task is allotted equally to a very large number of users. Users perform the tasks that meet their own needs and benefit from their own work. Whenever users feel that term conversion is required, they can perform the term generation and term conversion tasks themselves, so that the system is always kept in a state that satisfies its users. The present invention provides: 1) a dictionary manager for unit expressions of meaning, with which such expressions can easily be generated; and 2) a search annotator, which is a means for categorizing words and converting (annotating) the words belonging to a word group into unit expressions of meaning. The annotator operates as part of a search system, and the alignment and search of words use existing search system functions. The present invention also provides 3) a unit-expression-of-meaning converter (annotator) that performs a function similar to the search annotator. Building a global information system based on unit expressions of meaning is an enormous endeavor. However, the unclear meaning of natural language is a major obstacle to development in many fields.
The present invention provides a basis for considerable advances in the semantic web, search system, language translation, and artificial intelligence fields by supplying them with an unambiguous language.
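
The group-wise conversion described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the dictionary contents, the `word#n` identifier format, the function names `build_word_index` and `annotate_group`, and the use of a co-occurring context word as the selection criterion are all assumptions made for illustration. The point it shows is that occurrences of a word are first grouped, and a whole group is then annotated with one unit expression in a single command, rather than converting each occurrence individually.

```python
from collections import defaultdict

# Hypothetical dictionary: one unit expression (identifier) per meaning of a term.
UNIT_DICTIONARY = {
    "bank": {
        "bank#1": "financial institution",
        "bank#2": "edge of a river",
    },
}


def build_word_index(documents):
    """Group every occurrence of each word: word -> [(doc_id, position), ...]."""
    index = defaultdict(list)
    for doc_id, text in documents.items():
        for position, token in enumerate(text.lower().split()):
            index[token].append((doc_id, position))
    return index


def annotate_group(documents, index, word, context_word, unit_term, annotations):
    """Annotate, in one command, every not-yet-annotated occurrence of `word`
    whose document also contains `context_word` (an assumed grouping rule)."""
    if unit_term not in UNIT_DICTIONARY.get(word, {}):
        raise KeyError(f"{unit_term!r} is not a registered meaning of {word!r}")
    converted = 0
    for doc_id, position in index.get(word, []):
        if (doc_id, position) in annotations:
            continue  # already converted by an earlier command
        if context_word in documents[doc_id].lower().split():
            annotations[(doc_id, position)] = unit_term
            converted += 1
    return converted


documents = {
    "d1": "She deposited money at the bank",
    "d2": "They fished from the river bank",
    "d3": "The bank approved the loan",
}
index = build_word_index(documents)
annotations = {}

# A few group-level commands cover all occurrences of the word "bank".
counts = [
    annotate_group(documents, index, "bank", "money", "bank#1", annotations),
    annotate_group(documents, index, "bank", "loan", "bank#1", annotations),
    annotate_group(documents, index, "bank", "river", "bank#2", annotations),
]
```

In this sketch the number of commands depends on how many groups the occurrences fall into, not on how many times the word appears; a real system would presumably rely on the search engine's own index and ranking to form the groups, as the abstract indicates.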

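Bibliographic data

  • Publication number: WO2011155736A9
  • Patent type:
  • Publication date: 2012-06-21
  • Original format: PDF
  • Applicant/patentee: PARK DONG MIN
  • Application number: WO2011KR04113
  • Inventor: PARK DONG MIN
  • Filing date: 2011-06-06
  • Classification: G06F17/20; G06F17/21; G06F17/30
  • Country: WO
  • Database entry time: 2022-08-21 17:18:39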
