METHOD AND DEVICE FOR TRAINING LANGUAGE MODEL, METHOD AND DEVICE FOR KANA/KANJI CONVERSION, COMPUTER PROGRAM, AND COMPUTER READABLE RECORDING MEDIUM

Abstract

PROBLEM TO BE SOLVED: To provide a kana/kanji conversion device that converts words defined by their parts of speech, based on a statistical language model.

SOLUTION: In step S7002, a computer system uses a dictionary 202 and a user dictionary 206 to build, from the input hiragana 704, a set (lattice) of ID combinations (paths) in which word IDs and part-of-speech IDs are mixed. In step S7004, the probability that each path occurs is extracted from a statistical language model 304, and a path/probability correlation chart 708, in which each path is associated with its probability, is generated. In step S7006, the path with the highest probability is selected from the path/probability correlation chart 708 as a conversion candidate 710. In step S7008, the selected path is converted into a kana/kanji character string 712 by using the dictionary 202 and the user dictionary 206.

COPYRIGHT: (C)2004, JPO
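The four steps of the abstract can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration: the `Entry` type, the toy dictionary contents, the `W:`/`POS:` ID notation, and above all the bigram table with a probability floor, which stands in for the statistical language model 304 (the abstract does not say what form that model takes). User-dictionary words are represented by a part-of-speech ID rather than a word ID, so a single path can mix both kinds of ID.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    reading: str   # hiragana reading
    surface: str   # kana/kanji spelling
    token_id: str  # word ID, or a part-of-speech ID for user-dictionary words


# Hypothetical toy data; real dictionaries would be far larger.
DICTIONARY = [
    Entry("きょう", "今日", "W:今日"),
    Entry("きょう", "京", "W:京"),
    Entry("は", "は", "W:は"),
    Entry("はれ", "晴れ", "W:晴れ"),
    Entry("れ", "れ", "W:れ"),
]
USER_DICTIONARY = [Entry("きょう", "杏", "POS:noun")]

# Assumed bigram model P(next ID | previous ID); "<s>" marks the start.
BIGRAM = {
    ("<s>", "W:今日"): 0.5,
    ("<s>", "W:京"): 0.1,
    ("<s>", "POS:noun"): 0.2,
    ("W:今日", "W:は"): 0.6,
    ("W:京", "W:は"): 0.3,
    ("POS:noun", "W:は"): 0.4,
    ("W:は", "W:晴れ"): 0.3,
    ("W:は", "W:は"): 0.01,
    ("W:は", "W:れ"): 0.01,
}
FLOOR = 1e-6  # assumed probability for ID pairs missing from the table


def build_lattice(kana, entries):
    """S7002: enumerate every segmentation of `kana` into dictionary entries."""
    if not kana:
        return [[]]
    paths = []
    for entry in entries:
        if kana.startswith(entry.reading):
            for rest in build_lattice(kana[len(entry.reading):], entries):
                paths.append([entry] + rest)
    return paths


def path_probability(path):
    """S7004: multiply bigram probabilities along the path's ID sequence."""
    prob, prev = 1.0, "<s>"
    for entry in path:
        prob *= BIGRAM.get((prev, entry.token_id), FLOOR)
        prev = entry.token_id
    return prob


def convert(kana):
    lattice = build_lattice(kana, DICTIONARY + USER_DICTIONARY)
    if not lattice:
        return kana  # no segmentation found; leave the input unchanged
    chart = [(path, path_probability(path)) for path in lattice]  # S7004
    best_path, _ = max(chart, key=lambda pair: pair[1])           # S7006
    return "".join(entry.surface for entry in best_path)          # S7008


print(convert("きょうははれ"))
```

Enumerating every path is only workable for short inputs; a practical converter would more likely search the lattice with dynamic programming, but exhaustive enumeration keeps the correspondence with the path/probability chart explicit.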
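Bibliographic data

  • Publication number: JP2004118461A
  • Patent type:
  • Publication date: 2004-04-15
  • Original format: PDF
  • Applicant/patentee: MICROSOFT CORP
  • Application number: JP20020279934
  • Inventors: ISHIBASHI NORIKO; KANEKI HIROAKI
  • Filing date: 2002-09-25
  • Classification: G06F17/22; G06F17/21
  • Country: JP
  • Database entry time: 2022-08-21 23:31:54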