Miscategorized outlier detection using unsupervised SLM-GBM approach and structured data

Abstract

In an example, one or more leaf category specific unsupervised statistical language model (SLM) models are trained using sample item listings corresponding to each of one or more leaf categories and structured data about the one or more leaf categories, the training including calculating an expected perplexity and a standard deviation for item listing titles. A perplexity for a title of a particular item listing is calculated and a perplexity deviation signal is generated based on a difference between the perplexity for the title of the particular item listing and the expected perplexity for item listing titles in a leaf category of the particular item listing and based on the standard deviation for item listing titles in the leaf category of the particular item listing. A gradient boosting machine (GBM) fuses the perplexity deviation signal with one or more other signals to generate a miscategorization classification score corresponding to the particular item listing.
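The perplexity deviation signal described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the patent text: the class name `UnigramSLM`, the choice of an add-one smoothed unigram model, whitespace tokenization, and the z-score form of the signal (difference from the expected perplexity divided by the standard deviation). The structured data input and the GBM fusion step are not shown.

```python
import math
from collections import Counter
from statistics import mean, pstdev


class UnigramSLM:
    """Hypothetical add-one smoothed unigram model for one leaf category."""

    def __init__(self, titles):
        # Assumes every training title contains at least one token.
        tokens = [tok for title in titles for tok in title.lower().split()]
        self.counts = Counter(tokens)
        self.total = len(tokens)
        # Reserve one extra vocabulary slot for unseen tokens.
        self.vocab_size = len(self.counts) + 1
        # Expected perplexity and standard deviation over the training titles.
        perplexities = [self.perplexity(t) for t in titles]
        self.expected_perplexity = mean(perplexities)
        self.std_perplexity = pstdev(perplexities)

    def perplexity(self, title):
        """Per-token perplexity of a title under this category's model."""
        tokens = title.lower().split()
        if not tokens:
            return float("inf")
        log_prob = sum(
            math.log((self.counts[tok] + 1) / (self.total + self.vocab_size))
            for tok in tokens
        )
        return math.exp(-log_prob / len(tokens))

    def deviation_signal(self, title):
        """Assumed form: z-score of the title's perplexity in the category."""
        if self.std_perplexity == 0:
            return 0.0
        diff = self.perplexity(title) - self.expected_perplexity
        return diff / self.std_perplexity


# Toy sample listings for a hypothetical "shoes" leaf category.
shoe_titles = [
    "nike running shoes size 10",
    "adidas running shoes mens size 9",
    "leather dress shoes black size 11",
    "nike air max shoes size 10",
]
slm = UnigramSLM(shoe_titles)
in_cat = slm.deviation_signal("nike running shoes size 9")
out_cat = slm.deviation_signal("cordless power drill 18v battery")
```

In this toy setup, a title made entirely of words unseen in the category gets a much larger deviation than a typical title. In the approach the abstract describes, a signal of this kind would be one input feature among several to the gradient boosting machine that produces the miscategorization score.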

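Bibliographic data

  • Publication number: US10095770B2

  • Patent type:

  • Publication date: 2018-10-09

  • Original format: PDF

  • Applicant/Assignee: EBAY INC.

  • Application number: US201514861746

  • Inventor: MINGKUAN LIU

  • Filing date: 2015-09-22

  • Classification: G06F17/30; G06F17/27

  • Country: US

  • Date added to database: 2022-08-21 13:03:40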
