Determining word boundary likelihoods in potentially incomplete text

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for determining word boundary likelihoods in potentially incomplete text. In one aspect, a method includes selecting query sequences from the query, each query sequence being at least a portion of a word n-gram, the word n-gram being a subsequence of up to n words selected from the second sequence of words of the query, and for each query sequence: determining one or more query sequence keys for the query sequence; determining at least one of a word boundary count and a non-word boundary count for each query sequence key, each word-boundary count and non-word boundary count being dependent on the context of the query sequence; and associating, in a data storage device, the at least one word boundary count and non-word boundary counts with each query sequence key.
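The counting step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented method: the abstract does not say how query sequence keys are derived or what "context" means, so the sketch assumes (hypothetically) that a key is a character prefix of a word n-gram and that the context is simply whether the next character in the logged query is a space or the end of the text. The names `build_counts` and `boundary_likelihood`, and the use of an in-memory dictionary in place of a data storage device, are illustrative choices as well.

```python
from collections import defaultdict


def build_counts(queries, n=2):
    """Accumulate boundary statistics over logged queries.

    Returns a dict mapping key -> [word_boundary_count, non_word_boundary_count].
    Assumption: logged queries are treated as complete text.
    """
    counts = defaultdict(lambda: [0, 0])
    for query in queries:
        words = query.split()
        for start in range(len(words)):
            # Word n-gram: up to n consecutive words starting at `start`.
            text = " ".join(words[start:start + n])
            # Every character prefix is "at least a portion" of the n-gram.
            for end in range(1, len(text) + 1):
                key = text[:end]
                if key.endswith(" "):
                    continue  # skip keys that end in whitespace
                # Assumed context: does the key stop exactly at the end of a word?
                at_boundary = end == len(text) or text[end] == " "
                counts[key][0 if at_boundary else 1] += 1
    return counts


def boundary_likelihood(counts, key):
    """Fraction of observations in which `key` ended at a word boundary."""
    entry = counts.get(key)
    if entry is None or sum(entry) == 0:
        return None  # key never observed
    return entry[0] / sum(entry)
```

With counts built from the two queries "new york" and "newark", the key "new" is seen once at a word boundary and once inside a word, so `boundary_likelihood(counts, "new")` evaluates to 0.5. A real system would presumably use far larger query logs and a richer notion of context.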

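Bibliographic data

  • Publication number: US8930399B1

    Patent type:

  • Publication date: 2015-01-06

    Original format: PDF

  • Applicant/assignee: GOOGLE INC.

    Application number: US201313739591

  • Inventors: ABHINANDAN S. DAS; HARRY S. FUNG

    Filing date: 2013-01-11

  • Classification: G06F17/30

  • Country: US

  • Date indexed: 2022-08-21 15:16:36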
