
INDEX-SIDE DIACRITICAL CANONICALIZATION


Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for index-side synonym expansion. One method includes obtaining a token sequence for a resource and indexing a particular token in the token sequence. The indexing includes obtaining a diacritically canonicalized form of the particular token; determining that the diacritically canonicalized form of the particular token is different from the particular token; and storing data associating the resource with both the particular token and the different diacritically canonicalized form of the particular token as index terms for the resource in a search engine.
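The indexing step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the abstract does not say how the diacritically canonicalized form is obtained, so the sketch assumes it can be approximated by Unicode decomposition followed by removal of combining marks. The names `canonicalize`, `index_resource`, and the in-memory `index` dictionary are hypothetical.

```python
import unicodedata
from collections import defaultdict


def canonicalize(token: str) -> str:
    """Return an assumed diacritically canonicalized form of a token.

    Assumption: canonicalization means stripping combining marks after
    NFD decomposition. A real system may use language-specific rules.
    """
    decomposed = unicodedata.normalize("NFD", token)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", stripped)


def index_resource(index, resource_id, tokens):
    """Index each token and, when it differs, its canonicalized form too."""
    for token in tokens:
        # The original token is always stored as an index term.
        index[token].add(resource_id)
        canonical = canonicalize(token)
        # Store the canonical form only if it differs from the token.
        if canonical != token:
            index[canonical].add(resource_id)


# Hypothetical in-memory inverted index: index term -> set of resource ids.
index = defaultdict(set)
index_resource(index, "doc1", ["caf\u00e9", "menu"])
```

With this layout, a query for either the accented or the unaccented spelling finds the resource through a plain term lookup, so no diacritic handling is needed at query time.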
