An apparatus for predicting a human depression level by analyzing subtle facial expressions, according to an embodiment of the present invention, includes: a data storage unit that stores video data; a spatial information generation unit that generates spatial information from the video data; a VLDN feature map generation unit that extracts three consecutive frames, generates VLDN (volume local directional number) feature maps from them to analyze facial dynamics across successive frames, and inputs the feature maps into a deep CNN (convolutional neural network) model to generate dynamic information about facial movements; and a predictor. The spatial information and the dynamic information are reduced to an output value through a Temporal Median Pooling (TMP) method, and the predictor predicts the human depression level from that output value using a recurrent neural network.
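The Temporal Median Pooling step can be illustrated with a minimal sketch, assuming TMP means taking the element-wise median of per-frame feature vectors along the time axis. The abstract does not give the feature dimensions or say how the spatial and dynamic streams are merged, so the array shapes, the sample values, the function name `temporal_median_pooling`, and the concatenation at the end are all hypothetical.

```python
import numpy as np


def temporal_median_pooling(frame_features: np.ndarray) -> np.ndarray:
    """Collapse the time axis (axis 0) by taking the element-wise median."""
    frame_features = np.asarray(frame_features, dtype=float)
    if frame_features.ndim < 2 or frame_features.shape[0] == 0:
        raise ValueError("expected a non-empty array shaped (frames, ...)")
    return np.median(frame_features, axis=0)


# Hypothetical per-frame features: 3 frames, 2 features per frame.
spatial_features = np.array([[1.0, 5.0], [2.0, 9.0], [100.0, 7.0]])
dynamic_features = np.array([[0.1, 0.4], [0.3, 0.2], [0.2, 0.9]])

pooled_spatial = temporal_median_pooling(spatial_features)
pooled_dynamic = temporal_median_pooling(dynamic_features)

# Assumption: the two pooled streams are concatenated into one output vector
# that a sequence model could consume; the abstract does not specify this.
output_value = np.concatenate([pooled_spatial, pooled_dynamic])
```

A median is less sensitive than a mean to a single outlying frame: in the sample data, the value 100.0 in the third frame does not shift the pooled result for that feature.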