
SYSTEMS AND METHODS FOR TRANSLATING CHINESE PINYIN TO CHINESE CHARACTERS


Abstract

Systems and methods to process and translate pinyin to Chinese characters and words are disclosed. A Chinese language model is trained by extracting unknown character strings from Chinese inputs, e.g., documents and/or user inputs/queries, determining valid words from the unknown character strings, and generating a transition matrix based on the Chinese inputs for predicting a word string given the context. A method for translating a pinyin input generally includes generating a set of Chinese character strings from the pinyin input using a Chinese dictionary including words derived from the Chinese inputs and a language model trained based on the Chinese inputs, each character string having a weight indicating the likelihood that the character string corresponds to the pinyin input. Ambiguous user input may be classified as non-pinyin or pinyin by identifying an ambiguous pinyin/non-pinyin ASCII word in the user input and analyzing the context to classify the user input.
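The two core steps in the abstract, building a transition matrix from Chinese inputs and weighting candidate character strings for a pinyin input, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the patent: the toy corpus, the toy dictionary, the names `train_transitions`, `translate`, `DICTIONARY`, `CORPUS`, and `FLOOR`, the use of a simple bigram model, and the exhaustive enumeration of candidates (a practical decoder would more likely prune with dynamic programming).

```python
from collections import Counter, defaultdict
from itertools import product

# Hypothetical toy dictionary: pinyin syllable -> candidate characters.
DICTIONARY = {
    "zhong": ["中", "钟"],
    "guo": ["国", "过"],
}

# Hypothetical pre-segmented Chinese inputs used to train the model.
CORPUS = [["中", "国"], ["中", "国"], ["中", "间"], ["钟", "表"]]

FLOOR = 1e-6  # assumed small probability for unseen transitions


def train_transitions(sentences):
    """Estimate P(next | prev) from bigram counts; "<s>" marks sentence start."""
    counts = defaultdict(Counter)
    for words in sentences:
        prev = "<s>"
        for word in words:
            counts[prev][word] += 1
            prev = word
    return {
        (prev, word): n / sum(nexts.values())
        for prev, nexts in counts.items()
        for word, n in nexts.items()
    }


def translate(syllables, transitions):
    """Return (character string, weight) pairs for the input, best first."""
    options = [DICTIONARY.get(s, []) for s in syllables]
    candidates = []
    for words in product(*options):
        weight, prev = 1.0, "<s>"
        for word in words:
            # Multiply in the transition probability, or FLOOR if unseen.
            weight *= transitions.get((prev, word), FLOOR)
            prev = word
        candidates.append(("".join(words), weight))
    return sorted(candidates, key=lambda c: c[1], reverse=True)


transitions = train_transitions(CORPUS)
ranked = translate(["zhong", "guo"], transitions)
print(ranked[0][0])  # highest-weighted candidate string
```

With this toy data the highest-weighted candidate for `zhong guo` is 中国, because both the start-to-中 and 中-to-国 transitions were observed in the corpus while the competing strings rely on the floor probability. A syllable missing from the dictionary yields an empty candidate list in this sketch.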

Bibliographic data

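  • Publication number: SG125573A1

    Patent type:

  • Publication date: 2006-10-30

    Original format: PDF

  • Applicant/patentee: GOOGLE INC.

    Application number: SG2006064257

  • Inventors: ZHU HUICAN; WU JUN; ZHU HONGJUN

    Filing date: 2005-03-16

  • Classification: G06F17/20; G10L15/00

  • Country: SG

  • Database entry time: 2022-08-21 21:37:04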
