High-dimensional data streams contain a large amount of irrelevant and redundant information, which may greatly degrade the performance of learning algorithms. Attribute relevance can be used to remove irrelevant and redundant attributes from a data stream effectively, improving the efficiency of learning algorithms. This paper analyzes the limitations of existing attribute relevance calculation methods in practice and proposes a curve-fitting-based attribute relevance feature selection algorithm, FSCFFR (Feature Selection based on Curve-Fitting Feature Relevance). Theoretical analysis and experiments show that FSCFFR achieves good real-time performance and effectiveness in the feature selection process.
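The abstract does not give the details of FSCFFR, so the following is only an illustrative sketch of the general idea of scoring attribute relevance by curve fitting, not the authors' algorithm. It assumes that relevance is measured as the R² of a least-squares polynomial fit, that one window of the stream is processed at a time, and that the names `curve_fit_relevance`, `select_features`, `relevance_min`, and `redundancy_max` are hypothetical and chosen for this example.

```python
import numpy as np

def curve_fit_relevance(x, y, degree=2):
    """R^2 of a least-squares polynomial fit of y on x (assumed relevance measure)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    ss_tot = np.sum((y - y.mean()) ** 2)
    if ss_tot == 0.0 or np.ptp(x) == 0.0:
        return 0.0  # a constant column carries no information
    coeffs = np.polyfit(x, y, degree)
    residuals = y - np.polyval(coeffs, x)
    r2 = 1.0 - np.sum(residuals ** 2) / ss_tot
    return float(max(0.0, r2))

def select_features(X, y, relevance_min=0.1, redundancy_max=0.9, degree=2):
    """Greedy selection on one window: drop irrelevant, then redundant, attributes.

    Thresholds are illustrative assumptions, not values from the paper.
    """
    X = np.asarray(X, dtype=float)
    scores = [curve_fit_relevance(X[:, j], y, degree) for j in range(X.shape[1])]
    order = np.argsort(scores)[::-1]  # most relevant attribute first
    selected = []
    for j in order:
        if scores[j] < relevance_min:
            break  # all remaining attributes are treated as irrelevant
        # An attribute is redundant if an already selected one predicts it well.
        if all(curve_fit_relevance(X[:, k], X[:, j], degree) <= redundancy_max
               for k in selected):
            selected.append(int(j))
    return selected
```

In this sketch, an attribute that is well predicted by an already selected attribute is treated as redundant, and an attribute whose fit to the target is poor is treated as irrelevant. A streaming variant would rerun the selection on each new window.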