
Sparse and large-scale learning models and algorithms for mining heterogeneous big data.

Abstract

With the development of PCs, the internet, and mobile devices, we are facing an era of exploding data. On one hand, more and more features can be collected to describe the data, so the data descriptors keep growing in size. On the other hand, the number of data samples itself is exploding, and data can be collected from multiple sources. When the data become large-scale, traditional data analysis methods may fail, suffering from the curse of dimensionality, among other problems. In order to explore and analyze large-scale data more accurately and more efficiently, we propose several learning algorithms, based on the characteristics of the data, to mine heterogeneous data. Specifically, if the feature dimension is large, we propose several sparse-learning-based feature selection methods to select key words from text or to find biomarkers in gene expression data; if the number of samples is huge, we propose a multi-view K-Means method for clustering that avoids the heavy burden of graph construction; if the data are represented by or collected from multiple sources, we propose graph-based multi-modality models for semi-supervised learning and clustering. In addition, if the number of classes is large, we provide a global solution to low-rank regression and prove that low-rank regression is equivalent to performing linear regression in the LDA space. We empirically evaluate each of our proposed models on several benchmark data sets, and our methods consistently achieve superior results compared with state-of-the-art methods.
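To make the idea of sparse-learning-based feature selection concrete, the following is a minimal sketch and not the algorithm from the dissertation, whose exact formulations are not given in this abstract. It assumes a common generic formulation, least squares with an ℓ2,1-norm penalty on the weight matrix, solved by plain proximal gradient descent. The function name `l21_feature_selection` and the parameters `lam` and `n_iter` are hypothetical and chosen for illustration only.

```python
import numpy as np


def l21_feature_selection(X, Y, lam=1.0, n_iter=500):
    """Rank features by solving min_W ||XW - Y||_F^2 + lam * ||W||_{2,1}.

    Uses plain proximal gradient descent (an illustrative choice, not
    necessarily the solver used in the thesis).
    """
    n_features = X.shape[1]
    W = np.zeros((n_features, Y.shape[1]))
    # Step size 1/L, where L = 2 * sigma_max(X)^2 is the Lipschitz constant
    # of the gradient of the squared loss.
    step = 1.0 / (2.0 * np.linalg.norm(X, 2) ** 2)
    for _ in range(n_iter):
        grad = 2.0 * X.T @ (X @ W - Y)
        V = W - step * grad
        # Proximal operator of the l2,1 norm: shrink each row toward zero,
        # setting rows with small norm exactly to zero.
        row_norms = np.linalg.norm(V, axis=1, keepdims=True)
        scale = np.maximum(0.0, 1.0 - step * lam / np.maximum(row_norms, 1e-12))
        W = scale * V
    # Features with larger row norms in W are considered more important.
    scores = np.linalg.norm(W, axis=1)
    return np.argsort(-scores), W
```

Calling `ranking, W = l21_feature_selection(X, Y, lam=20.0)` on a samples-by-features matrix `X` and a samples-by-targets matrix `Y` returns feature indices ordered from most to least important. Because the penalty acts on whole rows of `W`, a feature is either kept for all targets or dropped for all of them, which is what makes this kind of penalty suitable for selecting key words or biomarkers.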


Bibliographic record

Author: Cai, Xiao
Author affiliation: The University of Texas at Arlington
Degree-granting institution: The University of Texas at Arlington
Subject: Engineering Computer; Information Technology
Degree: Ph.D.
Year: 2013
Pages: 130 p.
Total pages: 130
Original format: PDF
Language of text: eng

Similar literature

1. Image classification algorithm based on stacked sparse coding deep learning model-optimized kernel function nonnegative sparse representation [J]. An Fengping. Soft Computing: A Fusion of Foundations, Methodologies and Applications. 2020, No. 22
2. A rapid mining model for extracting sparse distribution association semantic link from large-scale web resources [J]. Zhang Shunxiang, Lu Kui, Yin Xiaobo. International Journal of Ad Hoc and Ubiquitous Computing. 2017, No. 1a2
3. Application of machine learning algorithms for clinical predictive modeling: a data-mining approach in SCT [J]. R Shouval, O Bondi, H Mishan. Bone Marrow Transplantation. 2014, No. 3
4. Roundtable Gossip Algorithm: A Novel Sparse Trust Mining Method for Large-Scale Recommendation Systems [C]. Mengdi Liu, Guangquan Xu, Jun Zhang. International Conference on Algorithms and Architectures for Parallel Processing. 2018
5. Data mining techniques to enable large-scale exploratory analysis of heterogeneous scientific data [D]. Chopra, Pankaj. 2009
6. Advanced Heterogeneous Feature Fusion Machine Learning Models and Algorithms for Improving Indoor Localization [O]. Lingwen Zhang, Ning Xiao, Wenkao Yang. 2019
7. Linear-Time Algorithm for Learning Large-Scale Sparse Graphical Models [O]. Salar Fattahi, Richard Y. Zhang, Somayeh Sojoudi. 2019
