A theory of multitask learning for learning from disparate data sources.

Abstract

Many endeavors require the integration of data from multiple data sources. One major obstacle to such undertakings is that different sources may vary considerably in the way they choose to represent their data, even if their data collections are otherwise perfectly compatible. In practice, this problem is usually solved by manually constructing translations between these data representations, although there have been some recent attempts to supplement this with automated algorithms based on machine learning methods.

This work addresses the problem of making classification predictions based on data from multiple sources, without constructing explicit translations between them. We view this problem as a special case of the multitask learning problem: both intuition and much empirical work indicate that learning can be improved by attacking multiple related tasks simultaneously. However, thus far, no theoretical work has been able to support this claim, and no concrete definition has been proposed for what it means for two learning tasks to be “related.”

In this work, we introduce a general notion of relatedness between tasks, provide the standard sort of information complexity bound for such tasks, and give general conditions under which this bound is an improvement over standard single-task learning results.

Finally, we apply these results to the problem of learning from disparate data sources. We give a decision tree learning algorithm for this problem for a particular type of data source disparity and demonstrate its empirical success on real data sets.


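Bibliographic record

Author: Schuller, Rebecca Ann
Author affiliation: Cornell University
Degree-granting institution: Cornell University
Subject: Computer Science; Mathematics
Degree: Ph.D.
Year: 2003
Page: p.4467
Total pages: 106
Original format: PDF
Language: eng
Chinese Library Classification: Automation technology, computer technology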

Similar literature

1. The establishment situation of forest learning facilities: from constructing the database by four data sources. (Special Issue: The development and the practical studies in forest environmental education.) [Japanese] [J]. Kiyama K., Inoue M., Oishi Y. 日本森林学会誌. 2014, No. 1.
2. Data-driven multitask sparse dictionary learning for noise attenuation of 3D seismic data [J]. Siahsar Mohammad Amir Nazari, Gholtashi Saman, Kahoo Amin Roshandel. Geophysics: Journal of the Society of Exploration Geophysicists. 2017, No. 6.
3. Multitask Metric Learning: Theory and Algorithm [J]. Boyu Wang, Hejia Zhang, Peng Liu. JMLR: Workshop and Conference Proceedings. 2018, No. 12.
4. Transformation learning based domain adaptation for robust classification of disparate hyperspectral data [C]. Xiong Zhou, Saurabh Prasad. International Geoscience and Remote Sensing Symposium. 2017.
5. Aprendizaje maquina multitarea mediante edicion de datos y algoritmos de aprendizaje extremo. Multitask learning with data editing and extreme learning machine. [D]. Bueno Crespo, Andres. 2013.
6. Unsupervised Learning and Pattern Recognition of Biological Data Structures with Density Functional Theory and Machine Learning [O]. Chien-Chang Chen, Hung-Hui Juan, Meng-Yuan Tsai.
7. DCASE 2019 Task 2: Multitask Learning, Semi-supervised Learning and Model Ensemble with Noisy Data for Audio Tagging [O]. Osamu Akiyama, Junya Sato. 2019.
