SIMILARITY MODEL-BASED DATA PROCESSING METHOD AND SYSTEM

Abstract

A similarity model-based data processing method and system that aim to improve customer conversion rates at reduced cost. The method comprises: collecting a plurality of customer data records; extracting continuous label data from each customer record and converting it by binning into multiple groups of discrete label data; calculating a similarity distance for the discrete factors in each group of discrete label data, and selecting the discrete factors that contribute significantly to form multiple groups of new discrete label data; calculating weights for the discrete factors in the new discrete label data using the random forest algorithm and the gradient boosting decision tree algorithm respectively, and obtaining a weighted result for each group of discrete factors by weighted summation; and calculating, using the Manhattan distance algorithm, the final similarity distance between each customer record and the positive sample data from the weighted result of each group of discrete factors and the similarity distance of each discrete factor.
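The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not specify the binning scheme, the per-factor distance, or how the two models' weights are combined. The sketch therefore assumes equal-width binning, an absolute bin-index difference as the per-factor distance, and an even blend (`alpha=0.5`) of two importance vectors. The importance values are made-up placeholders standing in for what a random forest and a gradient boosting model might report (for example via scikit-learn's `feature_importances_`), and the factor-selection step is omitted. All function names, column meanings, and data are hypothetical.

```python
from bisect import bisect_right


def equal_width_edges(values, n_bins):
    """Interior cut points for equal-width binning (assumed scheme)."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / n_bins
    return [lo + step * i for i in range(1, n_bins)]


def to_bin(value, edges):
    """Map a continuous value to a discrete bin index in 0..len(edges)."""
    return bisect_right(edges, value)


def combine_weights(rf_importance, gbdt_importance, alpha=0.5):
    """Blend two importance vectors by weighted summation, then normalise."""
    blended = [alpha * r + (1 - alpha) * g
               for r, g in zip(rf_importance, gbdt_importance)]
    total = sum(blended)
    return [b / total for b in blended]


def weighted_manhattan(customer_bins, positive_bins, weights):
    """Weighted L1 (Manhattan) distance between two binned label vectors."""
    return sum(w * abs(c - p)
               for w, c, p in zip(weights, customer_bins, positive_bins))


# Hypothetical continuous labels per customer: (age, monthly_spend).
customers = [(23, 120.0), (35, 560.0), (52, 900.0), (41, 300.0)]
positive_sample = (38, 600.0)  # a customer known to have converted

# Bin every column using edges derived from the customer data.
columns = list(zip(*customers))
edges = [equal_width_edges(col, n_bins=4) for col in columns]
binned = [tuple(to_bin(v, e) for v, e in zip(row, edges)) for row in customers]
positive_bins = tuple(to_bin(v, e) for v, e in zip(positive_sample, edges))

# Placeholder importances; in practice these would come from trained models.
rf_importance = [0.7, 0.3]
gbdt_importance = [0.5, 0.5]
weights = combine_weights(rf_importance, gbdt_importance)

# Smaller distance means the customer looks more like the positive sample.
distances = [weighted_manhattan(b, positive_bins, weights) for b in binned]
ranking = sorted(range(len(customers)), key=lambda i: distances[i])

for i in ranking:
    print(f"customer {i}: bins={binned[i]} distance={distances[i]:.2f}")
```

Ranking customers by ascending distance puts those most similar to the positive sample first, which is how such a score could be used to prioritise outreach.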
