Generalizaciones de minimos cuadrados parciales con aplicacion en clasificacion supervisada (Spanish text).

English translation: Generalizations of partial least squares with application to supervised classification (Spanish text).

Abstract

The development of technologies such as microarrays has generated a large amount of data. The main characteristic of this kind of data is the large number of predictors (genes) and the small number of observations (experiments). Thus, the data matrix X is of order n x p, where n is much smaller than p. Before using any multivariate statistical technique, such as regression or classification, to analyze the information contained in these data, we need to apply feature selection methods and/or dimensionality reduction using orthogonal variables, in order to eliminate multicollinearity among the predictor variables, which can lead to severe prediction errors, and to reduce the computational burden required to build and validate the classifier.

Principal component analysis (PCA) is a technique that has been used for some time to reduce dimensionality. However, the first components, which capture most of the variability in the data structure, do not necessarily improve prediction when they are used for regression and classification (Yeung and Ruzzo, 2001). Partial least squares (PLS), introduced by Wold (1975), was an important contribution to reducing dimensionality in a regression context using orthogonal components. The confidence that the first PLS components improve prediction has made PLS a widely used technique, particularly in the area of chemistry known as chemometrics. Nguyen and Rocke (2002), working on supervised classification methods for microarray data, reduced the dimensionality by first applying feature selection using statistical techniques such as difference of means and analysis of variance, after which they applied PLS regression, treating the vector of classes (a categorical variable) as a response vector (a continuous variable). This procedure is not adequate, since the predictions are not necessarily integers and must be rounded, losing accuracy. In spite of these shortcomings, regression PLS yields reasonable results.

In this thesis we implement generalizations of regression PLS as a dimensionality reduction technique to be applied in supervised classification. We extend a technique introduced by Bastien et al. (2002), who combined PLS with ordinal logistic regression for multiclass problems. However, since ordered classes are very uncommon, in this work PLS is combined with nominal logistic regression. We also considered multivariate PLS combined with logistic regression, as well as the construction of PLS components from linear discriminant analysis and from projection pursuit. The proposals presented in this thesis improve on two recent results, by Fort and Lambert (2004) and by Ding and Gentleman (2004), which combine logistic regression and PLS but are suitable only for datasets with two classes. A library of R functions was built to carry out the different proposals.
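The general idea described in the abstract, PLS components used as a dimensionality reduction step ahead of a nominal (multinomial) logistic classifier, can be illustrated with a minimal Python sketch. This is not the thesis's R library or its exact algorithm. It assumes the classes are coded as an indicator matrix, extracts components with a NIPALS-style deflation in which each weight vector is the dominant left singular vector of the cross-product matrix, and fits the logistic model by plain gradient descent. The function names `pls_scores`, `fit_softmax`, and `predict`, and the synthetic data, are hypothetical.

```python
import numpy as np

def pls_scores(X, Y, n_components):
    """Extract PLS components of X guided by the response matrix Y.

    Returns the score matrix T (n x n_components) and a rotation R
    such that T = (X - column means of X) @ R.
    """
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    E = Xc.copy()                       # deflated predictor matrix
    W, P = [], []
    for _ in range(n_components):
        # Weight vector: dominant left singular vector of E'Y
        U, _, _ = np.linalg.svd(E.T @ Yc, full_matrices=False)
        w = U[:, 0]
        t = E @ w                       # scores of the new component
        p = E.T @ t / (t @ t)           # loadings of X on t
        E = E - np.outer(t, p)          # remove what t already explains
        W.append(w)
        P.append(p)
    W, P = np.column_stack(W), np.column_stack(P)
    R = W @ np.linalg.inv(P.T @ W)
    return Xc @ R, R

def fit_softmax(Z, labels, n_classes, lr=0.1, n_iter=2000):
    """Nominal (multinomial) logistic regression by gradient descent."""
    A = np.hstack([np.ones((Z.shape[0], 1)), Z])   # add intercept column
    Y = np.eye(n_classes)[labels]                  # one-hot class matrix
    B = np.zeros((A.shape[1], n_classes))
    for _ in range(n_iter):
        logits = A @ B
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        prob = np.exp(logits)
        prob /= prob.sum(axis=1, keepdims=True)
        B -= lr * A.T @ (prob - Y) / A.shape[0]
    return B

def predict(Z, B):
    """Assign each row of Z to the class with the largest linear score."""
    A = np.hstack([np.ones((Z.shape[0], 1)), Z])
    return (A @ B).argmax(axis=1)

# Synthetic "microarray-like" data (assumed): n = 60 samples, p = 200 genes.
rng = np.random.default_rng(0)
n_per_class, p, n_classes = 20, 200, 3
labels = np.repeat(np.arange(n_classes), n_per_class)
X = rng.normal(size=(labels.size, p))
X[labels == 1, :10] += 2.0      # genes 0-9 are shifted in class 1
X[labels == 2, 10:20] += 2.0    # genes 10-19 are shifted in class 2
Y = np.eye(n_classes)[labels]   # indicator coding of the class vector

T, R = pls_scores(X, Y, n_components=2)   # reduce 200 genes to 2 components
Z = T / T.std(axis=0)                     # rescale scores before fitting
B = fit_softmax(Z, labels, n_classes)
train_accuracy = (predict(Z, B) == labels).mean()
print(f"training accuracy: {train_accuracy:.2f}")
```

Because the classifier returns class labels directly, no rounding of continuous predictions is needed, which is the shortcoming the abstract attributes to treating the class vector as a continuous response. The accuracy printed here is measured on the training data only; with n much smaller than p it is optimistic, and a real evaluation would use cross-validation with the PLS step refitted inside each fold.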


Bibliographic record

Author
Vega Vilca, Jose Carlos.;
Author affiliation

University of Puerto Rico, Mayaguez (Puerto Rico).;

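Degree-granting institution: University of Puerto Rico, Mayaguez (Puerto Rico).
Subject: Computer Science; Statistics.
Degree: Ph.D.
Year: 2004
Pages: 118 p.
Total pages: 118
Original format: PDF
Text language: eng
Chinese Library Classification: Automation technology, computer technology; Statistics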

Similar literature


1. CLASIFICACION NO SUPERVISADA CON IMAGENES A COLOR DE COBERTURA TERRESTRE [J]. Antonia Macedo-Cruz, Gonzalo Pajares-Martinsanz, Matilde Santos-Penas. Agrociencia. 2010, No. 6

English translation: Unsupervised classification of land-cover color images

2. Efecto determinante de la motivación de viaje sobre la imagen de destino en turistas de ocio a un destino urbano: el caso de Monterrey, México. Una aproximación por medio de mínimos cuadrados parciales (PLS) [J]. José T. Olague. Turismo y Sociedad. 2016, No. 1

English translation: Determining effect of travel motivation on destination image among leisure tourists to an urban destination: the case of Monterrey, Mexico. A partial least squares (PLS) approach

3. APLICACION DE UNA LENGUA ELECTRONICA VOLTAMETRICA PARA LA CLASIFICACION DE VINOS Y ESTUDIO DE CORRELACION CON LA CARACTERIZACION QUIMICA Y SENSORIAL [J]. Alvaro A. Arrieta, Maria L. Rodriguez-Mendez, Jose A. De Saja. Quimica nova. 2010, No. 4

English translation: Application of a voltammetric electronic tongue to the classification of wines and a study of its correlation with chemical and sensory characterization

4. Dimensionamiento de dos sistema de refrigeration desde el punto de vista Tecnico Economico en una aplicacion de baja temperatura, con las alternativas de un sistema en cascada y un sistema en doble etapa [C]. Juan Manuel. Annual Meeting of International Institute of Ammonia Refrigeration. 2010

English translation: Sizing of two refrigeration systems from a technical and economic point of view in a low-temperature application, with the alternatives of a cascade system and a two-stage system

5. Procesos de pensamiento en la solucion de problemas de aplicacion de administracion de empresas con el uso de la calculadora grafica (Spanish text). [D]. Velazquez Rosado, Wanda. 2002

English translation: Thought processes in solving applied business administration problems with the use of the graphing calculator (Spanish text).

6. Incidencia de la fractura de cadera osteoporótica en Galicia en relación con la dispensación de medicamentos con indicación en su prevención y tratamiento [O]. María Mercedes Guerra-García, José Benito Rodríguez-Fernández, Elías Puga-Sarmiento. 2011

English translation: Incidence of osteoporotic hip fracture in Galicia in relation to the dispensing of medications indicated for its prevention and treatment

7. Determinación de nitrógeno foliar en palma de aceite con espectroscopía en el infrarrojo medio (mir) y cercano (nir) por el método de regresión de mínimos cuadrados parciales de componentes principales (pls). [O]. Jhoan Jose Crespo Gonzalez, Orlando Simon Ruiz Villadiego, Karen Stefanie Ospino Villalba. 2020

English translation: Determination of foliar nitrogen in oil palm by mid-infrared (MIR) and near-infrared (NIR) spectroscopy using the partial least squares regression method of principal components (PLS).

8. Aplicacion Del Metodo de Diferencias Finitas a la Resolucion de Ecuaciones Diferenciales en Derivadas Parciales Mixtas Para El Caso de Flujo Transonico (Application of the Finite Difference Method to the Solution of Mixed Partial [R]. Mongegomez, F. 1987

English translation: Application of the finite difference method to the solution of mixed partial differential equations for the case of transonic flow
