Disclosed is a video-based personal emotion recognition method using semi-supervised learning and multi-modal networks. According to one embodiment of the present invention, the method may comprise the steps of: inputting one or more signals from among image data, facial feature point data, and voice data present in a video into a deep learning network configured on the basis of semi-supervised learning and multi-modal networks for personal emotion recognition; and adaptively fusing the probability information obtained for each signal input into the deep learning network, thereby recognizing the emotions of a person in the video.
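The adaptive fusion step may be illustrated by the following minimal sketch. The abstract does not state how the fusion weights are determined, so the entropy-based confidence weighting used here is an assumption chosen only for illustration, not the claimed method. The label set `EMOTIONS`, the function names `entropy` and `adaptive_fuse`, and the modality keys are likewise hypothetical. Each modality is assumed to have already produced a probability vector over the same emotion classes, and a modality absent from the video is simply omitted from the input.

```python
import math

# Hypothetical label set; the abstract does not enumerate the emotion classes.
EMOTIONS = ["angry", "happy", "neutral", "sad"]


def entropy(probs):
    """Shannon entropy (natural log) of one probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def adaptive_fuse(probs_by_modality):
    """Fuse per-modality class probabilities into one distribution.

    probs_by_modality maps a modality name (e.g. "image", "landmarks",
    "voice") to a list of class probabilities that sums to 1.

    ASSUMPTION: a modality's weight is 1 minus its normalized entropy, so a
    confident (peaked) prediction counts more than an uncertain (flat) one.
    Returns (fused_probabilities, normalized_weights).
    """
    if not probs_by_modality:
        raise ValueError("at least one modality is required")
    n_classes = len(next(iter(probs_by_modality.values())))
    if n_classes < 2:
        raise ValueError("at least two classes are required")
    if any(len(p) != n_classes for p in probs_by_modality.values()):
        raise ValueError("all modalities must cover the same classes")

    max_entropy = math.log(n_classes)
    # The small epsilon keeps the weights usable when every input is uniform.
    raw = {
        name: max(0.0, 1.0 - entropy(p) / max_entropy) + 1e-6
        for name, p in probs_by_modality.items()
    }
    total = sum(raw.values())
    weights = {name: w / total for name, w in raw.items()}

    fused = [
        sum(weights[name] * p[i] for name, p in probs_by_modality.items())
        for i in range(n_classes)
    ]
    return fused, weights


if __name__ == "__main__":
    fused, weights = adaptive_fuse({
        "image": [0.05, 0.85, 0.05, 0.05],  # confident prediction
        "voice": [0.25, 0.25, 0.25, 0.25],  # uninformative prediction
    })
    best = max(range(len(fused)), key=fused.__getitem__)
    print(EMOTIONS[best], weights)
```

In this sketch the confident image prediction dominates the fused result while the flat voice prediction receives a near-zero weight. A learned weighting, such as a small gating network trained jointly with the modality networks, would be another plausible reading of "adaptively fusing".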