The disclosure describes an audio-based emotion recognition system that classifies emotions in real time. According to some embodiments, the emotion recognition system adjusts the behavior of intelligent systems, such as a virtual coach, depending on the user's emotion, thereby providing an improved user experience. Embodiments of the emotion recognition system and method operate on short utterances of real-time speech from the user and use prosodic and phonetic features, such as fundamental frequency, amplitude, and Mel-Frequency Cepstral Coefficients (MFCCs), as the main set of features by which the speech is characterized. In addition, certain embodiments of the present invention use One-Against-All or Two-Stage classification systems to distinguish different emotions. A minimum-error feature removal mechanism is further provided in alternate embodiments to reduce bandwidth and increase the accuracy of the emotion recognition system.
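The One-Against-All scheme mentioned above can be illustrated with a minimal sketch: one binary classifier is trained per emotion class, and the class whose classifier scores highest wins. The label set, feature dimensionality, and synthetic data below are hypothetical placeholders; the disclosure's actual features (fundamental frequency, amplitude, MFCCs) are assumed to have been extracted upstream into per-utterance feature vectors.

```python
import numpy as np

# Hypothetical example label set; the disclosure does not enumerate the emotions.
EMOTIONS = ["neutral", "happy", "sad", "angry"]

def train_one_vs_all(X, y, n_classes, lr=0.1, epochs=200):
    """Train one binary logistic classifier per emotion (One-Against-All)."""
    n_features = X.shape[1]
    W = np.zeros((n_classes, n_features))
    b = np.zeros(n_classes)
    for k in range(n_classes):
        t = (y == k).astype(float)           # this emotion vs. all others
        for _ in range(epochs):
            z = X @ W[k] + b[k]
            p = 1.0 / (1.0 + np.exp(-z))     # sigmoid probability
            W[k] -= lr * (X.T @ (p - t)) / len(y)
            b[k] -= lr * np.mean(p - t)
    return W, b

def predict(W, b, X):
    """Pick the emotion whose binary classifier scores highest."""
    return np.argmax(X @ W.T + b, axis=1)

# Toy demonstration on synthetic "feature vectors" (stand-ins for
# prosodic/MFCC statistics computed from short utterances).
rng = np.random.default_rng(0)
centers = rng.normal(size=(len(EMOTIONS), 8)) * 3.0
X = np.vstack([c + rng.normal(scale=0.3, size=(30, 8)) for c in centers])
y = np.repeat(np.arange(len(EMOTIONS)), 30)
W, b = train_one_vs_all(X, y, len(EMOTIONS))
acc = float(np.mean(predict(W, b, X) == y))
```

The Two-Stage alternative named in the abstract would instead cascade two coarser decisions (for example, arousal first, then valence), but the abstract gives no further detail, so only the One-Against-All variant is sketched here.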