EFFICIENT FACTOR ANALYSIS ON LARGE DATASETS USING CATEGORICAL VARIABLES

Abstract

Methods and apparatus are disclosed for efficient factor analysis of a large population of data records, using factors that are categorical variables. Computation is balanced between extracting key factors by training a machine learning classifier on a reduced sample of data records, for computational efficiency, and scoring the categorical values of the key factors on the entire population, for accuracy of results. A joint factor is constructed by combining all proposed root factors, and the sample is generated by stratified sampling on the joint factor. The key factors are selected from candidate factors which can be combinations of the root factors. Original variables of a dataset, whether categorical or not, can be binned to obtain new categorical factors. Variations and user interfaces are also disclosed.
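
The flow described in the abstract can be illustrated with a minimal sketch. It is not the patented method: the abstract does not name a classifier or a scoring formula, so information gain stands in here for the trained classifier's factor ranking, and a value's score is assumed to be its target rate minus the overall rate. The field names (`region`, `plan`, `churned`), the synthetic data, and all function names are hypothetical, and the candidate factors are limited to the root factors themselves.

```python
import random
from collections import Counter, defaultdict
from math import log2


def joint_factor(record, root_factors):
    """Combine all root factor values of one record into a single joint value."""
    return tuple(record[f] for f in root_factors)


def stratified_sample(records, root_factors, fraction, seed=0):
    """Sample each joint-factor stratum proportionally, keeping at least one record."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for rec in records:
        strata[joint_factor(rec, root_factors)].append(rec)
    sample = []
    for members in strata.values():
        k = max(1, round(fraction * len(members)))
        sample.extend(rng.sample(members, k))
    return sample


def entropy(labels):
    """Shannon entropy (in bits) of a list of labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def information_gain(records, factor, target):
    """Assumed stand-in for a trained classifier's factor importance."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec[factor]].append(rec[target])
    remainder = sum(len(g) / len(records) * entropy(g) for g in groups.values())
    return entropy([rec[target] for rec in records]) - remainder


def score_values(records, factor, target):
    """Assumed score: per-value target rate minus the overall target rate."""
    overall = sum(rec[target] for rec in records) / len(records)
    hits, counts = Counter(), Counter()
    for rec in records:
        counts[rec[factor]] += 1
        hits[rec[factor]] += rec[target]
    return {value: hits[value] / counts[value] - overall for value in counts}


# Synthetic population (hypothetical): the target depends on `plan` only.
data_rng = random.Random(42)
population = []
for _ in range(10_000):
    region = data_rng.choice(["north", "south", "east"])
    plan = data_rng.choice(["basic", "premium"])
    churned = 1 if plan == "basic" and data_rng.random() < 0.6 else 0
    population.append({"region": region, "plan": plan, "churned": churned})

root_factors = ["region", "plan"]

# Cheap step: pick the key factor on a 5% stratified sample.
sample = stratified_sample(population, root_factors, fraction=0.05)
key_factor = max(root_factors, key=lambda f: information_gain(sample, f, "churned"))

# Accurate step: score the key factor's categorical values on every record.
scores = score_values(population, key_factor, "churned")
print(key_factor, scores)
```

Stratifying on the joint factor keeps every observed combination of root-factor values in the sample, so rare combinations are not lost before the key factors are chosen. Only the final scoring pass touches the full population.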