Miscategorized outlier detection using unsupervised SLM-GBM approach and structured data

Abstract

In an example, one or more leaf category specific unsupervised statistical language model (SLM) models are trained using sample item listings corresponding to each of one or more leaf categories and structured data about the one or more leaf categories, the training including calculating an expected perplexity and a standard deviation for item listing titles. A perplexity for a title of a particular item listing is calculated and a perplexity deviation signal is generated based on a difference between the perplexity for the title of the particular item listing and the expected perplexity for item listing titles in a leaf category of the particular item listing and based on the standard deviation for item listing titles in the leaf category of the particular item listing. A gradient boosting machine (GBM) fuses the perplexity deviation signal with one or more other signals to generate a miscategorization classification score corresponding to the particular item listing.
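The perplexity deviation signal described in the abstract can be illustrated with a small sketch. The abstract does not give the exact formula or the form of the SLM, so everything below is an assumption made for illustration: a toy add-one-smoothed unigram model stands in for the leaf-category SLM, the structured category data is ignored, and the signal is computed as a z-score-style ratio (difference from the expected perplexity divided by the standard deviation). The names `UnigramSLM` and `perplexity_deviation` are hypothetical.

```python
import math
from collections import Counter
from statistics import mean, pstdev


class UnigramSLM:
    """Toy add-one-smoothed unigram model for one leaf category (illustrative only)."""

    def __init__(self, titles):
        tokens = [tok for title in titles for tok in title.lower().split()]
        self.counts = Counter(tokens)
        self.total = len(tokens)
        # One extra vocabulary slot is reserved for tokens never seen in training.
        self.vocab_size = len(self.counts) + 1
        # Expected perplexity and standard deviation over the training titles.
        perplexities = [self.perplexity(title) for title in titles]
        self.expected_ppl = mean(perplexities)
        self.std_ppl = pstdev(perplexities)

    def perplexity(self, title):
        tokens = title.lower().split()
        if not tokens:
            raise ValueError("title must contain at least one token")
        denominator = self.total + self.vocab_size
        log_prob = sum(math.log((self.counts[tok] + 1) / denominator) for tok in tokens)
        # Perplexity is the exponential of the average negative log-probability.
        return math.exp(-log_prob / len(tokens))


def perplexity_deviation(model, title, eps=1e-9):
    """Assumed form of the signal: (perplexity - expected) / standard deviation."""
    return (model.perplexity(title) - model.expected_ppl) / (model.std_ppl + eps)


# Hypothetical sample listings for a single leaf category (phone cases).
model = UnigramSLM([
    "apple iphone case black",
    "samsung galaxy case clear",
    "iphone case leather black",
    "galaxy phone case blue",
])
in_category = perplexity_deviation(model, "iphone case blue")
out_of_category = perplexity_deviation(model, "mens running shoes size")
print(in_category, out_of_category)
```

A title whose words rarely occur in the category yields a much larger deviation than a typical title. In the approach the abstract describes, a value like this would be one input among several that a gradient boosting machine fuses into the miscategorization score; that fusion step is not shown here.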

Bibliographic details

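  • Publication number: US10984023B2
  • Patent type:
  • Publication date: 2021-04-20
  • Original format: PDF
  • Applicant/assignee: EBAY INC.
  • Application number: US201816138163
  • Inventor: MINGKUAN LIU
  • Filing date: 2018-09-21
  • Classification codes: G06F16/28; G06F16/22; G06F16/215; G06F40/211; G06F40/216; G06F40/284
  • Country: US
  • Date indexed: 2022-08-24 18:17:10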
