Method and system for morphologizing text

Abstract

A method and system are provided for morphologizing written or printed texts, including Japanese texts, in accordance with codes. The longest morphemes are divided one at a time from the characters in a sentence. Each division forms the longest morpheme from the remaining characters in the sentence that is listed in a dictionary of valid morphemes, then determines whether that morpheme is conjunctive with the previously divided morpheme. To determine whether a formed morpheme is conjunctive, its associated pairs of front and back connection codes are retrieved. If a front connection code of one retrieved pair and a back connection code of one of the previously divided morpheme's pairs are co-listed in a table of permissible relationships, the formed morpheme is conjunctive. If no morpheme can be divided from the remaining characters in the sentence, a previously divided morpheme is redivided. If a morpheme can be divided and is conjunctive with the previous morpheme, a connection action describing the relationship between the formed morpheme and the previously divided morpheme is recorded. In response to certain connection actions, the next morpheme is divided by forming it from a single character of the remaining characters and testing it. After all of the morphemes are divided, a word graph is constructed from the morphemes in accordance with the connection actions relating adjacent morphemes.
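The division loop described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The dictionary entries, the connection codes (`N`, `V`, `S`, `X`, `E`, `START`), the action names, and the function `divide` are all hypothetical and chosen only to show longest-first matching, the conjunctivity check against a table of permissible relationships, and redivision when a dead end is reached. The single-character rule and the word-graph construction are left out.

```python
from typing import Dict, List, Optional, Tuple

CodePair = Tuple[str, str]  # (front connection code, back connection code)

# Hypothetical toy dictionary of valid morphemes; not taken from the patent.
DICTIONARY: Dict[str, List[CodePair]] = {
    "ab": [("N", "N")],
    "abc": [("N", "V")],
    "cd": [("S", "E")],
    "d": [("X", "E")],
}

# Hypothetical table of permissible relationships:
# (back code of previous morpheme, front code of next morpheme) -> connection action
PERMITTED: Dict[Tuple[str, str], str] = {
    ("START", "N"): "begin",
    ("N", "S"): "attach-suffix",
}

MAX_LEN = max(len(m) for m in DICTIONARY)


def divide(
    sentence: str, pos: int = 0, prev_back: str = "START"
) -> Optional[List[Tuple[str, str]]]:
    """Return [(morpheme, connection action), ...], or None if no division exists."""
    if pos == len(sentence):
        return []
    # Try the longest candidate first, then progressively shorter ones.
    for end in range(min(len(sentence), pos + MAX_LEN), pos, -1):
        candidate = sentence[pos:end]
        for front, back in DICTIONARY.get(candidate, []):
            action = PERMITTED.get((prev_back, front))
            if action is None:
                continue  # not conjunctive with the previously divided morpheme
            rest = divide(sentence, end, back)
            if rest is not None:
                return [(candidate, action)] + rest
        # Reaching this point means the candidate led to a dead end,
        # so the loop redivides by trying a shorter candidate.
    return None


print(divide("abcd"))
```

With this toy data, `divide("abcd")` first tries the longest morpheme `abc`, finds that the remaining `d` is not conjunctive with it, redivides to `ab`, and returns `[('ab', 'begin'), ('cd', 'attach-suffix')]`. Recursion stands in for the redivision step here; an explicit stack of previously divided morphemes would work equally well.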
