To address the problem of music emotion classification, a music emotion recognition method based on convolutional neural networks is proposed. First, the mel-frequency cepstral coefficients (MFCC) and residual phase (RP) are combined with weighting to extract low-level audio features of the music, improving the efficiency of data mining. Then, the spectrogram is fed into a convolutional recurrent neural network (CRNN) to extract the time-domain, frequency-domain, and sequence features of the audio. At the same time, the low-level audio features are fed into a bidirectional long short-term memory (Bi-LSTM) network to further capture the sequential information of the audio features. Finally, the two sets of features are fused and passed to a softmax classifier trained with an additional center loss term to recognize four music emotions. Experimental results on an emotion music dataset show that the proposed method achieves a recognition accuracy of 92.06% and a loss value of about 0.98, both better than those of the comparison methods. The proposed method offers a feasible new direction for the development of music emotion recognition.
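Two components of the pipeline can be illustrated concretely: the weighted combination of MFCC and RP features, and the classification objective that pairs softmax cross-entropy with a center loss. The following NumPy sketch is illustrative only; the weighting factor `alpha`, the loss weight `lam`, and all function names are hypothetical, since the abstract does not give the paper's actual values or implementation.

```python
import numpy as np

def weighted_features(mfcc, rp, alpha=0.6):
    """Weighted combination of MFCC and residual-phase (RP) feature
    vectors. alpha is a hypothetical mixing weight, not the paper's."""
    return alpha * mfcc + (1.0 - alpha) * rp

def softmax(logits):
    """Row-wise softmax with the usual max-shift for numerical stability."""
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def center_loss(features, labels, centers):
    """Center loss: half the mean squared distance between each sample's
    embedding and the center of its class (4 emotion classes here)."""
    diffs = features - centers[labels]
    return 0.5 * np.mean(np.sum(diffs ** 2, axis=1))

def total_loss(logits, features, labels, centers, lam=0.01):
    """Softmax cross-entropy plus a lambda-weighted center loss term,
    as in the combined objective the abstract describes."""
    probs = softmax(logits)
    ce = -np.mean(np.log(probs[np.arange(len(labels)), labels] + 1e-12))
    return ce + lam * center_loss(features, labels, centers)

# Toy usage: 2 samples, 4 emotion classes, 3-dimensional fused embeddings.
logits = np.array([[2.0, 1.0, 0.5, 0.1],
                   [0.2, 2.5, 0.3, 0.4]])
embeddings = np.array([[0.1, 0.0, 0.0],
                       [0.0, 0.2, 0.0]])
labels = np.array([0, 1])
centers = np.zeros((4, 3))  # one learnable center per emotion class
loss = total_loss(logits, embeddings, labels, centers)
```

The center loss pulls embeddings of the same emotion class toward a shared center, tightening intra-class clusters, while the cross-entropy term keeps the classes separable.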