At present,the models used in music auto-tagging are mostly hand-engineered,so the choice of the optimal feature is always difficult.We propose an unsupervised feature learning algorithm,which can automatically learn the underlying structure of feature without prior knowledge.The algorithm is achieved in three stages.The preprocessing stage extracts the chroma-frequency spectrogram,and reduces the dimensionality via PCA whitening.The second stage applies the denoising autoencoder to the reduced feature in an unsupervised manner,and aggregates a new feature vector by max-pooling function and averaging.The last stage maps the feature vector to song labels by pre-trained multilayer perceptron (MLP) in a supervised manner.The result based on the Magnatagatune and GTZAN datasets shows that our algorithm improves the accuracy of music auto-tagging to some degree.%目前,音乐自动标注模型大多采用手动设计模式,因而存在最佳特征难以选择的问题.提出了一种基于非监督学习的特征学习算法,该算法能自动学习特征的潜在结构而不需要依赖先验知识.首先,预处理阶段主要提取音乐的音级轮廓频率谱并进行PCA白化降维处理;然后,采用深度学习中的降噪自动编码器算法对降维后的特征进行无监督的学习,并采用最大值池化和取均值来聚合得到新的特征向量;最后,将特征向量和标签送入多层感知机中进行有监督的学习.基于Magnatagatune和GTZAN数据库的实验结果表明,本文算法在一定程度上提高了音乐自动标注的准确率.
展开▼