Feature selection and statistical alternatives for machine learning applied to in-silico drug design.

Abstract

Feature selection has recently been the subject of intensive research in data mining, especially for datasets with a large number of descriptive attributes, such as QSAR (Quantitative Structure-Activity Relationship) data. QSAR is an in-silico drug design methodology that requires identifying the important features of molecules that explain a relevant drug property. A typical QSAR dataset for predicting an activity of interest has a large number of descriptive features (300–1000) for a relatively small number of compounds (molecules).

Finding the best feature subset for a given problem with N features requires evaluating all 2^N possible subsets. The best feature subset also depends on the predictive model that will be employed to predict future unknown values of the response variables of interest. Feature selection involves minimizing the number of features retained while maximizing the predictive power of the model. From this point of view, feature selection can be viewed as a special type of multi-objective optimization problem.

This dissertation proposes machine learning algorithms as predictive modeling tools for QSAR problems, and develops a novel approach to feature selection based on feature saliency. In addition, this approach is computationally less expensive than other machine learning feature selection methods (e.g., weight pruning for ANNs), and it works with any nonparametric regression algorithm.
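To make the 2^N figure in the abstract concrete, the following is a minimal Python sketch of exhaustive subset enumeration. It is not taken from the dissertation: the descriptor names and the helper `all_subsets` are hypothetical and chosen only for illustration. It enumerates every subset of a four-feature toy set and then counts the decimal digits of 2^300, the subset count at the lower end of the feature range quoted above.

```python
from itertools import combinations


def all_subsets(features):
    """Yield every subset of `features` as a tuple, including the empty set."""
    for size in range(len(features) + 1):
        for subset in combinations(features, size):
            yield subset


# Hypothetical molecular descriptor names (assumption, for illustration only).
features = ["logP", "mol_weight", "h_donors", "h_acceptors"]

subsets = list(all_subsets(features))
n_subsets = len(subsets)  # equals 2 ** len(features)

# Python integers are arbitrary precision, so 2 ** 300 is computed exactly.
digits_300 = len(str(2 ** 300))

print(f"{len(features)} features -> {n_subsets} subsets")
print(f"300 features -> a subset count with {digits_300} decimal digits")
```

With only four features the enumeration is trivial, but the count doubles with each added feature, which is why exhaustive search is impractical at the scale of a typical QSAR dataset and heuristic selection methods are used instead.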

Bibliographic record

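Author: Arciniegas, Fabio Andres
Author affiliation: Rensselaer Polytechnic Institute
Degree-granting institution: Rensselaer Polytechnic Institute
Subject: Operations Research; Engineering, Industrial
Degree: Ph.D.
Year: 2002
Pages: 250
Original format: PDF
Language: English
Chinese Library Classification: Operations research; General industrial technology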

Similar literature

1. GA-based method for feature selection and parameters optimization for machine learning regression applied to software effort estimation [J]. Adriano L.I. Oliveira, Petronio L. Braga, Ricardo M.F. Lima. Information and Software Technology. 2010, No. 11
2. Resting-State Functional Network Scale Effects and Statistical Significance-Based Feature Selection in Machine Learning Classification [J]. Hao Guo, Yao Li, Godfred Kim Mensah. Computational and Mathematical Methods in Medicine. 2019, No. 1
3. Machine learning and feature selection for drug response prediction in precision oncology applications [J]. Mehreen Ali, Tero Aittokallio. Biophysical Reviews. 2019, No. 1
4. Hybrid Model of Correlation Based Filter Feature Selection and Machine Learning Classifiers Applied on Smart Meter Data Set [C]. Janvier Omar Sinayobye, Swaib Kaawaase Kyanda, N. Fred Kiwanuka. IEEE/ACM Symposium on Software Engineering in Africa. 2019
5. Non-redundant clustering, principal feature selection and learning methods applied to lung tumor image-guided radiotherapy [D]. Cui, Ying. 2009
6. Resting-State Functional Network Scale Effects and Statistical Significance-Based Feature Selection in Machine Learning Classification [O]. Hao Guo, Yao Li, Godfred Kim Mensah. 2019
7. Creation of databases of ageing-related drugs and statistical analysis and applied machine learning for the prioritization of potential lifespan-extension drugs [O]. Barardo Diogo Gonçalves. 2016
