On a Generalization of the Gini Correlation for Statistical Data Mining.

Abstract

Cost-sensitive learning has received growing attention over the last twenty years. It generalizes the traditional classification problem by taking costs into consideration rather than accuracy alone. With the rapid development of machine-learning methods in this area, proper measures are needed to compare learners and select the best one. This thesis develops a new measure by generalizing the Gini correlation, an asymmetric measure commonly used in social economics. The new definition, called the generalized Gini correlation, includes special cases that are equivalent to common evaluation measures used in data mining, for example the LIFT measures for a binary response and the expected profit measure for a monetary response. We consider estimation and inference for the generalized Gini correlation. The asymptotic distribution of the estimated correlation is derived using empirical process theory. We also propose several ways of constructing confidence intervals and demonstrate their performance numerically.
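
For orientation, the classical (ungeneralized) Gini correlation that the abstract starts from is usually written as cov(X, F_Y(Y)) / cov(X, F_X(X)). The block below is a minimal sketch of a rank-based sample version and is not the thesis's generalized measure or its estimator. The helper names `ranks`, `covariance`, and `gini_correlation` are hypothetical, and the sketch assumes no tied values and a non-constant `x`.

```python
def ranks(values):
    """Ordinal ranks 1..n of the values (assumes no ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = float(rank)
    return result


def covariance(a, b):
    """Sample covariance of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / (n - 1)


def gini_correlation(x, y):
    """Classical sample Gini correlation of x with respect to y.

    Ranks stand in for the empirical distribution functions; the common
    scaling factor cancels in the ratio. Swapping x and y generally
    changes the result, which is the asymmetry the abstract refers to.
    """
    return covariance(x, ranks(y)) / covariance(x, ranks(x))


if __name__ == "__main__":
    x = [1, 2, 3, 4, 10]
    y = [2, 1, 3, 5, 4]
    print(gini_correlation(x, y))
    print(gini_correlation(y, x))
```

The ratio depends on `y` only through its ranks, so any strictly increasing transformation of `y` leaves the value unchanged, and a perfectly increasing relationship gives 1.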

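Bibliographic record

  • Author

    Gao, Yi

  • Author affiliation

    Northwestern University

  • Degree-granting institution Northwestern University
  • Subject Statistics
  • Degree Ph.D.
  • Year 2016
  • Pagination 128 p.
  • Total pages 128
  • Original format PDF
  • Language of text English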
