An arrangement for producing enhanced audio features, which in turn support the provision of enhanced audio-visual features for speech recognition. The input consists of noisy audio-visual features together with the noisy audio features related to them.