This paper considers the problem of automatically selecting the currently active speaker among more than thirty participants in a medium-sized meeting room. Video tracking and sound source localization techniques are implemented in the developed smart meeting room to record AVI files of speaker remarks. Video streams from five cameras are processed to register participants in fixed chair positions and to track the main speaker, using histogram comparison and an AdaBoost cascade classifier for face detection. Multichannel sound source localization based on the GCC-PHAT method estimates the speaker position using four microphone arrays. At an SNR of 18 dB, the sound source localization rate was about 97% and the fine RMSE was below 0.23 m.
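
The GCC-PHAT step can be illustrated with a minimal sketch for a single microphone pair. This is not the authors' implementation: the function name `gcc_phat`, its parameters, and the sampling rate in the usage lines are assumptions chosen for illustration. The sketch only estimates the time difference of arrival (TDOA) between two channels; combining TDOAs from several microphone pairs into a speaker position is not shown.

```python
import numpy as np


def gcc_phat(sig, ref, fs, max_tau=None):
    """Estimate the delay (in seconds) of `sig` relative to `ref`.

    Hypothetical helper for illustration; a positive result means
    `sig` arrives later than `ref`.
    """
    # Zero-pad so the circular correlation behaves like a linear one.
    n = len(sig) + len(ref)
    SIG = np.fft.rfft(sig, n=n)
    REF = np.fft.rfft(ref, n=n)

    # Cross-power spectrum with PHAT weighting: keep phase, drop magnitude.
    cross = SIG * np.conj(REF)
    cross /= np.abs(cross) + 1e-12  # small constant avoids division by zero

    cc = np.fft.irfft(cross, n=n)

    # Optionally restrict the search to physically plausible lags.
    max_shift = n // 2
    if max_tau is not None:
        max_shift = max(1, min(int(fs * max_tau), max_shift))

    # Reorder so that index 0 corresponds to lag -max_shift.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    shift = int(np.argmax(np.abs(cc))) - max_shift
    return shift / fs


# Usage with synthetic data (assumed 16 kHz sampling rate).
fs = 16000
rng = np.random.default_rng(0)
ref = rng.standard_normal(1024)
sig = np.concatenate((np.zeros(5), ref[:-5]))  # ref delayed by 5 samples
tau = gcc_phat(sig, ref, fs)
print(f"estimated delay: {tau * fs:.0f} samples")
```

The PHAT weighting discards the spectral magnitude so that the correlation peak depends only on phase, which is generally considered to make the peak sharper in reverberant rooms. Limiting `max_tau` to the microphone spacing divided by the speed of sound is a common way to reject implausible peaks.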