Journal: Computational Statistics & Data Analysis
(Psycho-)analysis of benchmark experiments: A formal framework for investigating the relationship between data sets and learning algorithms

Abstract

It is common knowledge that the performance of different learning algorithms depends on certain characteristics of the data, such as dimensionality, linear separability or sample size. However, formally investigating this relationship in an objective and reproducible way is not trivial. A new formal framework for describing the relationship between data set characteristics and the performance of different learning algorithms is proposed. The framework combines the advantages of benchmark experiments with the formal description of data set characteristics by means of statistical and information-theoretic measures and with the recursive partitioning of Bradley-Terry models for comparing the algorithms’ performances. The formal aspects of each component are introduced and illustrated by means of an artificial example. Its real-world usage is demonstrated with an application example consisting of thirteen widely used data sets and six common learning algorithms. The Appendix provides information on the implementation and the usage of the framework within the R language.
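The abstract names Bradley-Terry models as the tool for comparing algorithm performances. As a rough illustration of that one ingredient only, the sketch below fits a single, unpartitioned Bradley-Terry model to pairwise win counts with the standard minorization-maximization (MM) update. It is not the paper's implementation: the paper works in R and partitions the models recursively, whereas this is plain Python, and the function name `fit_bradley_terry` and the win counts are hypothetical. The sketch assumes every algorithm wins at least one comparison.

```python
def fit_bradley_terry(wins, n_iter=200, tol=1e-10):
    """Estimate Bradley-Terry worth parameters with the MM update.

    wins[i][j] counts how often algorithm i outperformed algorithm j.
    Returns worths that sum to 1, so that the modelled probability of
    i beating j is worth[i] / (worth[i] + worth[j]).
    Assumes every algorithm has at least one win (hypothetical helper).
    """
    k = len(wins)
    worth = [1.0 / k] * k  # start from equal worths
    for _ in range(n_iter):
        updated = []
        for i in range(k):
            total_wins = sum(wins[i][j] for j in range(k) if j != i)
            # Comparisons between i and j, weighted by the current worths.
            denom = sum(
                (wins[i][j] + wins[j][i]) / (worth[i] + worth[j])
                for j in range(k)
                if j != i
            )
            updated.append(total_wins / denom)
        scale = sum(updated)
        updated = [w / scale for w in updated]  # normalise to sum to 1
        converged = max(abs(a - b) for a, b in zip(updated, worth)) < tol
        worth = updated
        if converged:
            break
    return worth


# Hypothetical win counts for three algorithms, 10 comparisons per pair.
example_wins = [
    [0, 7, 9],
    [3, 0, 6],
    [1, 4, 0],
]
example_worth = fit_bradley_terry(example_wins)
print(example_worth)
```

A larger worth means the algorithm tends to win more of its pairwise comparisons. In the framework described above, such models would additionally be split according to data set characteristics, which this sketch does not attempt.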
