Compound word segmentation device and Japanese dictionary creation device

Abstract

PROBLEM TO BE SOLVED: To provide a compound word segmentation device that divides compound words easily and with high accuracy. SOLUTION: Word segmentation proceeds in four steps. First, the number of characters in the kanji-string part of the input word is set, and the frequency information array, the word division index array, and the division identifier array are cleared (step S11). Next, using a dictionary that holds frequency information on how often each pair of two kanji characters appears at the head and at the end of a character string, frequency information is assigned to each character boundary (step S12). A base-word division index and an affix division index are then set at the character boundaries (step S13). Finally, according to these indices, the compound word is divided into two-character kanji word bases and one-character affixes (prefixes or suffixes) (step S14).
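
Steps S11 to S14 can be illustrated with a minimal Python sketch. The abstract does not give the dictionary format or the rule for turning boundary frequencies into division indices, so everything concrete here is an assumption: the `PairFreq` layout, the function names, the additive boundary score, the `affix_penalty` parameter, and the small dynamic program used to choose the cuts. The frequency counts in the demo are invented for illustration only.

```python
from typing import Dict, List, Tuple

# Hypothetical dictionary format (not from the patent):
# two-kanji pair -> (frequency at word head, frequency at word end).
PairFreq = Dict[str, Tuple[int, int]]


def boundary_frequencies(word: str, pair_freq: PairFreq) -> List[int]:
    """Steps S11-S12: clear the array, then score each interior boundary."""
    n = len(word)
    freq = [0] * (n + 1)  # freq[i] is the boundary just before word[i]
    for i in range(1, n):
        if i >= 2:  # pair ending at this boundary: add its word-end frequency
            freq[i] += pair_freq.get(word[i - 2:i], (0, 0))[1]
        if i + 2 <= n:  # pair starting at this boundary: add its word-head frequency
            freq[i] += pair_freq.get(word[i:i + 2], (0, 0))[0]
    return freq


def split_compound(word: str, pair_freq: PairFreq,
                   affix_penalty: int = 1) -> List[Tuple[str, str]]:
    """Steps S13-S14 under an assumed rule: choose the cuts that maximise
    the summed boundary frequency, charging a penalty per one-character piece."""
    n = len(word)
    freq = boundary_frequencies(word, pair_freq)
    neg_inf = float("-inf")
    best = [neg_inf] * (n + 1)  # best[i]: best score for splitting word[:i]
    prev = [0] * (n + 1)        # prev[i]: start of the last piece in that split
    best[0] = 0
    for end in range(1, n + 1):
        for size in (2, 1):  # try a two-character base first so ties favour bases
            start = end - size
            if start < 0 or best[start] == neg_inf:
                continue
            score = best[start]
            if end < n:
                score += freq[end]  # reward a cut at a well-attested boundary
            if size == 1:
                score -= affix_penalty
            if score > best[end]:
                best[end], prev[end] = score, start
    # Walk back through the chosen cuts and label each piece.
    pieces: List[Tuple[str, str]] = []
    end = n
    while end > 0:
        start = prev[end]
        piece = word[start:end]
        pieces.append((piece, "base" if len(piece) == 2 else "affix"))
        end = start
    return pieces[::-1]


if __name__ == "__main__":
    # Invented counts, for illustration only.
    demo = {"国際": (10, 5), "会議": (8, 9), "議場": (1, 2)}
    print(split_compound("国際会議場", demo))
```

With the invented counts above, the boundary after 国際 and the boundary after 会議 receive the highest scores, so the sketch yields two two-character bases followed by the one-character suffix 場. A real implementation would need the patent's actual index rules, which the abstract only names.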

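Bibliographic details

  • Publication number: JP3983000B2
  • Patent type: not stated
  • Publication date: 2007-09-26
  • Original format: PDF
  • Applicant/patentee: 株式会社リコー
  • Application number: JP20010052637
  • Inventor: 亀田 雅之
  • Filing date: 2001-02-27
  • Classification: G06F17/27
  • Country: JP
  • Database entry time: 2022-08-21 21:10:37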
