METHOD AND SYSTEM FOR ATTRIBUTE EXTRACTION FROM PRODUCT TITLES USING SEQUENCE LABELING ALGORITHMS

Abstract

Some embodiments can comprise a system comprising one or more computer processing modules and one or more non-transitory storage modules storing computing instructions configured to run on the one or more computer processing modules and perform acts of: receiving, at the one or more computer processing modules and from a third-party electronic device, a title for a product; dividing, at the one or more computer processing modules, the title into a sequence of tokens; storing, by the one or more computer processing modules onto the one or more non-transitory storage modules, the sequence of tokens; determining, at the one or more computer processing modules and using a sequence labeling model, a type of each token of the sequence of tokens; storing, by the one or more computer processing modules onto the one or more non-transitory storage modules, the type of each token of the sequence of tokens; encoding, at the one or more computer processing modules, each token of the sequence of tokens to indicate the type of each token of the sequence of tokens, wherein the type of each token of the sequence of tokens can comprise a BIO encoding scheme, wherein: a label B of the BIO encoding scheme can indicate a first token of a brand name; a label I of the BIO encoding scheme can indicate a subsequent token of the brand name; and a label O of the BIO encoding scheme can indicate a token that is not part of the brand name; determining, at the one or more computer processing modules, a brand name present in the title using each token of the sequence of tokens, as encoded; storing, by the one or more computer processing modules onto the one or more non-transitory storage modules, the brand name present in the title; normalizing, at the one or more computer processing modules, the brand name present in the title to create a standardized representation of the brand name; writing, by the one or more computer processing modules onto the one or more non-transitory storage modules, the standardized representation of the brand name present in the title to an empty database entry associated with the product; and in response to a search request from a user, transmitting instructions to a user display to display a representation of the standardized representation of the brand name for each token of the sequence of tokens. Other embodiments are also disclosed herein.
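
The tokenizing, BIO decoding, and normalizing acts described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not specify the tokenizer, the sequence labeling model, or the normalization rule, so whitespace tokenization, hand-supplied labels standing in for model output, and the `CANONICAL_BRANDS` lookup table are all assumptions made here for illustration, as are the function names.

```python
from typing import List, Optional

# Hypothetical lookup from lowercase brand variants to one canonical spelling.
# The abstract does not say how normalization is performed.
CANONICAL_BRANDS = {
    "hewlett packard": "HP",
    "hp": "HP",
}


def tokenize(title: str) -> List[str]:
    """Divide a title into whitespace-delimited tokens (an assumed tokenizer)."""
    return title.split()


def decode_brand(tokens: List[str], labels: List[str]) -> Optional[str]:
    """Return the first brand span: one B token followed by any I tokens."""
    if len(tokens) != len(labels):
        raise ValueError("tokens and labels must have the same length")
    span: List[str] = []
    for token, label in zip(tokens, labels):
        if label == "B":
            if span:  # a second brand starts; keep only the first one
                break
            span = [token]
        elif label == "I" and span:
            span.append(token)
        elif span:  # an O label (or anything else) ends the current span
            break
    return " ".join(span) if span else None


def normalize_brand(brand: str) -> str:
    """Map a brand string to its canonical form, or return it unchanged."""
    key = " ".join(brand.lower().split())
    return CANONICAL_BRANDS.get(key, brand)


# Labels are written by hand here; in the described system they would come
# from the sequence labeling model.
tokens = tokenize("Hewlett Packard 15.6 inch Laptop")
labels = ["B", "I", "O", "O", "O"]
brand = decode_brand(tokens, labels)
standardized = normalize_brand(brand) if brand is not None else None
```

In this sketch a stray I label with no preceding B is ignored, and an unknown brand is passed through unchanged rather than rejected; both are design choices of the example, not behavior stated in the abstract.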
