BMC Bioinformatics
Prediction using step-wise L1 L2 regularization and feature selection for small data sets with large number of features
Background
Machine learning methods are now used for many biological prediction problems involving drugs, ligands, or polypeptide segments of a protein. Building a prediction model requires a so-called training data set of molecules with measured target properties. For many such problems the size of the training data set is limited, because the measurements have to be performed in a wet lab. Furthermore, the problems considered are often complex, so it is not clear which molecular descriptors (features) are suitable for establishing a strong correlation with the target property. In many applications all available descriptors are used. This can lead to difficult machine learning problems when thousands of descriptors are considered and only a few (e.g. fewer than a hundred) molecules are available for training.