Regression learning is a supervised learning technique that builds models from examples with real-valued labels. It usually needs a large number of training samples to achieve good predictive performance, yet in real applications only very few training samples can be collected. To address this problem, a novel twice-learning regression method, the neural network ensemble to regression tree (NERT) algorithm, is proposed based on the twice-learning framework. Using virtual sample generation, the method exploits the generated samples across two sequentially executed learning stages, which relieves the shortage of training samples and improves learning performance. By choosing a learner with strong generalization ability for the first stage and a learner with good comprehensibility for the second, the resulting model is both accurate and comprehensible. Experimental results on software effort estimation with very few training samples show that NERT achieves better predictive performance from small data than existing methods, and that the inherent comprehensibility of its model reveals the key factors in effort estimation.
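The two-stage idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no algorithmic details, so everything below is an assumption made for illustration. That covers the use of scikit-learn, bagging as the ensemble method, uniform sampling inside the observed feature ranges as the virtual sample generator, keeping the original labels for the real samples, and the hypothetical name `fit_nert_sketch` with its parameters.

```python
import numpy as np
from sklearn.ensemble import BaggingRegressor
from sklearn.neural_network import MLPRegressor
from sklearn.tree import DecisionTreeRegressor, export_text


def fit_nert_sketch(X, y, n_virtual=500, n_networks=10, max_depth=3, seed=0):
    """Hypothetical two-stage fit: neural network ensemble, then regression tree."""
    rng = np.random.default_rng(seed)

    # Stage 1 (assumed setup): a bagged ensemble of small neural networks,
    # chosen for generalization ability.
    ensemble = BaggingRegressor(
        MLPRegressor(hidden_layer_sizes=(8,), solver="lbfgs",
                     max_iter=2000, random_state=seed),
        n_estimators=n_networks,
        random_state=seed,
    )
    ensemble.fit(X, y)

    # Virtual sample generation (assumed scheme): draw inputs uniformly within
    # the per-feature range of the training data and label them with the ensemble.
    low, high = X.min(axis=0), X.max(axis=0)
    X_virtual = rng.uniform(low, high, size=(n_virtual, X.shape[1]))
    y_virtual = ensemble.predict(X_virtual)

    # Stage 2: a shallow regression tree, chosen for comprehensibility,
    # trained on the real samples plus the ensemble-labeled virtual samples.
    X_all = np.vstack([X, X_virtual])
    y_all = np.concatenate([y, y_virtual])
    tree = DecisionTreeRegressor(max_depth=max_depth, random_state=seed)
    tree.fit(X_all, y_all)
    return ensemble, tree


if __name__ == "__main__":
    # Synthetic small-sample data, for demonstration only.
    rng = np.random.default_rng(1)
    X = rng.uniform(0.0, 1.0, size=(30, 2))
    y = 3.0 * X[:, 0] + np.sin(4.0 * X[:, 1]) + rng.normal(0.0, 0.05, size=30)
    ensemble, tree = fit_nert_sketch(X, y)
    # The printed rules are what makes the second-stage model inspectable.
    print(export_text(tree, feature_names=["x0", "x1"]))
```

In this sketch the tree sees far more labeled points than the 30 real samples, which is the intended effect of the virtual samples, while its printed split rules indicate which inputs drive the prediction.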