COMPOUND WORD DIVIDING DEVICE, JAPANESE DICTIONARY PREPARING DEVICE, METHOD THEREFOR, PROGRAM AND RECORDING MEDIUM

Abstract

PROBLEM TO BE SOLVED: To provide a compound word dividing device that divides compound words easily and with high accuracy. SOLUTION: In the word dividing process, the number of characters in the kanji string portion of the input word is first set, and a frequency information array, a word division index array, and a division identifier array are cleared (step S11). Next, using a dictionary that holds frequency information on two-kanji pairs appearing at the head and at the end of character strings, frequency information is set at each character boundary (step S12), and a basic word division index and an affix division index are set at each character boundary (step S13). Finally, using the indices thus set, the compound word is divided into two-character kanji base words and one-character affixes (prefixes or suffixes) (step S14).
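Steps S11 to S14 can be illustrated with a minimal Python sketch. The abstract does not give the actual scoring rule, so everything specific here is an assumption made for illustration: the tables `HEAD_FREQ` and `TAIL_FREQ` and their counts are invented, the boundary score is taken as a simple sum of tail and head frequencies, and the division itself is chosen by a small dynamic program with a hypothetical `AFFIX_PENALTY`. The patent's own division index and division identifier arrays may work quite differently.

```python
from typing import Dict, List

# Hypothetical frequency dictionary (counts invented for illustration):
# how often a two-kanji pair occurs at the head / at the end of entries.
HEAD_FREQ: Dict[str, int] = {"国際": 30, "会議": 40, "規模": 20, "開発": 35}
TAIL_FREQ: Dict[str, int] = {"国際": 10, "会議": 25, "規模": 15, "開発": 30}

# Assumed small cost for each one-character piece, so that two-character
# base words are preferred over runs of single-character affixes.
AFFIX_PENALTY = 1


def boundary_frequencies(kanji: str) -> List[int]:
    """S11 + S12: clear the array, then score each interior boundary."""
    n = len(kanji)                      # S11: number of kanji characters
    freq = [0] * (n + 1)                # S11: cleared frequency array
    for i in range(1, n):               # S12: boundary between i-1 and i
        left = kanji[max(0, i - 2):i]   # pair ending at the boundary
        right = kanji[i:i + 2]          # pair starting at the boundary
        freq[i] = TAIL_FREQ.get(left, 0) + HEAD_FREQ.get(right, 0)
    return freq


def split_compound(kanji: str) -> List[str]:
    """S13 + S14 (assumed form): pick cuts so every piece has 1 or 2 chars."""
    n = len(kanji)
    freq = boundary_frequencies(kanji)
    best = [float("-inf")] * (n + 1)    # best score for the prefix kanji[:end]
    prev = [0] * (n + 1)                # start of the last piece in that prefix
    best[0] = 0
    for end in range(1, n + 1):
        for length in (2, 1):           # two-char base word or one-char affix
            start = end - length
            if start < 0:
                continue
            score = best[start]
            if end < n:                 # a cut at `end` earns its boundary score
                score += freq[end]
            if length == 1:
                score -= AFFIX_PENALTY
            if score > best[end]:
                best[end], prev[end] = score, start
    pieces: List[str] = []
    end = n
    while end > 0:                      # walk the chosen cuts backwards
        pieces.append(kanji[prev[end]:end])
        end = prev[end]
    return pieces[::-1]


print(split_compound("国際会議場"))   # ['国際', '会議', '場']
print(split_compound("大規模開発"))   # ['大', '規模', '開発']
```

With these invented counts, 場 comes out as a one-character suffix and 大 as a one-character prefix, which matches the kind of division the abstract describes. A real dictionary would be built from corpus or lexicon counts rather than hand-written tables.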

Bibliographic data

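  • Publication number: JP2002259370A
  • Patent type: (not stated)
  • Publication date: 2002-09-13
  • Original format: PDF
  • Applicant/patentee: RICOH CO LTD
  • Application number: JP20010052637
  • Inventor: KAMEDA MASAYUKI
  • Filing date: 2001-02-27
  • Classification: G06F17/27
  • Country: JP
  • Database entry time: 2022-08-22 01:01:33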
