Kernel-Based Ranking. Methods for Learning and Performance Estimation

Abstract

Machine learning provides tools for automated construction of predictive models in data intensive areas of engineering and science. The family of regularized kernel methods have in the recent years become one of the mainstream approaches to machine learning, due to a number of advantages the methods share. The approach provides theoretically well-founded solutions to the problems of under- and overfitting, allows learning from structured data, and has been empirically demonstrated to yield high predictive performance on a wide range of application domains. Historically, the problems of classification and regression have gained the majority of attention in the field. In this thesis we focus on another type of learning problem, that of learning to rank.

In learning to rank, the aim is from a set of past observations to learn a ranking function that can order new objects according to how well they match some underlying criterion of goodness. As an important special case of the setting, we can recover the bipartite ranking problem, corresponding to maximizing the area under the ROC curve (AUC) in binary classification. Ranking applications appear in a large variety of settings, examples encountered in this thesis include document retrieval in web search, recommender systems, information extraction and automated parsing of natural language. We consider the pairwise approach to learning to rank, where ranking models are learned by minimizing the expected probability of ranking any two randomly drawn test examples incorrectly. The development of computationally efficient kernel methods, based on this approach, has in the past proven to be challenging. Moreover, it is not clear what techniques for estimating the predictive performance of learned models are the most reliable in the ranking setting, and how the techniques can be implemented efficiently.

The contributions of this thesis are as follows.
First, we develop RankRLS, a computationally efficient kernel method for learning to rank, that is based on minimizing a regularized pairwise least-squares loss. In addition to training methods, we introduce a variety of algorithms for tasks such as model selection, multi-output learning, and cross-validation, based on computational shortcuts from matrix algebra. Second, we improve the fastest known training method for the linear version of the RankSVM algorithm, which is one of the most well established methods for learning to rank. Third, we study the combination of the empirical kernel map and reduced set approximation, which allows the large-scale training of kernel machines using linear solvers, and propose computationally efficient solutions to cross-validation when using the approach. Next, we explore the problem of reliable cross-validation when using AUC as a performance criterion, through an extensive simulation study. We demonstrate that the proposed leave-pair-out cross-validation approach leads to more reliable performance estimation than commonly used alternative approaches. Finally, we present a case study on applying machine learning to information extraction from biomedical literature, which combines several of the approaches considered in the thesis. The thesis is divided into two parts. Part I provides the background for the research work and summarizes the most central results, Part II consists of the five original research articles that are the main contribution of this thesis.
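
The pairwise ranking criterion and the leave-pair-out idea named in the abstract can be illustrated with a small sketch. This is a naive illustration under stated assumptions, not the thesis's algorithms: the function names (`pairwise_auc`, `ridge_fit`, `leave_pair_out_auc`) are hypothetical, the learner is plain linear regularized least-squares regression on 0/1 labels rather than RankRLS, and the model is retrained from scratch for every held-out pair, whereas the thesis relies on matrix algebra shortcuts to avoid that cost.

```python
import numpy as np

def pairwise_auc(scores, labels):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    # diff[a, b] > 0 means positive a is scored above negative b.
    diff = pos[:, None] - neg[None, :]
    return float(((diff > 0) + 0.5 * (diff == 0)).mean())

def ridge_fit(X, y, lam):
    """Closed-form linear regularized least squares: solve (X^T X + lam*I) w = X^T y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def leave_pair_out_auc(X, y, lam=1.0):
    """Naive leave-pair-out estimate: hold out each (positive, negative) pair,
    retrain on the rest, and check whether the held-out pair is ordered correctly."""
    pos_idx = np.flatnonzero(y == 1)
    neg_idx = np.flatnonzero(y == 0)
    correct = 0.0
    for i in pos_idx:
        for j in neg_idx:
            keep = np.ones(len(y), dtype=bool)
            keep[[i, j]] = False          # remove the pair from the training set
            w = ridge_fit(X[keep], y[keep], lam)
            s_i, s_j = X[i] @ w, X[j] @ w
            if s_i > s_j:
                correct += 1.0
            elif s_i == s_j:
                correct += 0.5
    return correct / (len(pos_idx) * len(neg_idx))
```

For example, `pairwise_auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])` evaluates to 0.75, since three of the four positive-negative pairs are ordered correctly. The double loop makes the cost of this naive version grow with the number of positive-negative pairs times the cost of one training run, which is the kind of expense the efficient cross-validation methods mentioned above are meant to reduce.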

Bibliographic record

Author: Airola, Antti
Year: 2011
Original format: PDF
Language: en