This paper presents an in-depth look at the influence of different speech and audio codecs on the performance of our continuous speech recognition engine. GSM full rate, G.711, G.723.1 and MPEG coders are investigated. It is shown that MPEG transcoding degrades speech recognition performance at low bitrates, whereas performance remains acceptable for specialized speech coders like GSM or G.711. A new strategy is proposed to cope with the degradation due to low bitrate coding: the acoustic models of the speech recognition system are trained with transcoded speech (one acoustic model for each speech/audio codec). First results show that this strategy allows one to recover acceptable performance.
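
The matched-training idea can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes continuous mu-law companding with mu = 255 as a stand-in for a telephony codec (G.711 mu-law uses a piecewise-linear approximation of this curve), and the names `mulaw_encode`, `mulaw_decode`, `transcode` and `select_acoustic_model` are hypothetical. Training audio is passed through an encode/decode round trip before acoustic model training, and at recognition time the model matching the incoming codec is chosen.

```python
import math

MU = 255  # assumed companding parameter (value used by 8-bit mu-law coding)


def mulaw_encode(x: float) -> int:
    """Compress one sample in [-1, 1] to an 8-bit code in 0..255."""
    x = max(-1.0, min(1.0, x))  # clip out-of-range input
    y = math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)
    return int(round((y + 1.0) / 2.0 * 255))


def mulaw_decode(code: int) -> float:
    """Expand an 8-bit code back to a sample in [-1, 1]."""
    y = 2.0 * code / 255 - 1.0
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)


def transcode(samples):
    """Encode/decode round trip: the signal a recognizer sees after coding."""
    return [mulaw_decode(mulaw_encode(s)) for s in samples]


def select_acoustic_model(codec: str, models: dict):
    """Pick the acoustic model trained on speech transcoded with `codec`."""
    return models[codec]


if __name__ == "__main__":
    clean = [0.0, 0.5, -0.5, 1.0, -1.0]
    print(transcode(clean))  # slightly distorted copy of `clean`
```

In a full system the round trip would be run through each real codec over the whole training corpus, giving one training set and one acoustic model per codec.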