Sequential Monte Carlo probability hypothesis density (SMC- PHD) ltering has been recently exploited for audio-visual (AV) based tracking of multiple speakers, where audio data are used to inform the particle distribution and propagation in the visual SMC-PHD lter. How- ever, the performance of the AV-SMC-PHD lter can be a ected by the mismatch between the proposal and the posterior distribution. In this pa- per, we present a new method to improve the particle distribution where audio information (i.e. DOA angles derived from microphone array mea- surements) is used to detect new born particles and visual information (i.e. histograms) is used to modify the particles with particle ow (PF). Using particle ow has the bene t of migrating particles smoothly from the prior to the posterior distribution. We compare the proposed algo- rithm with the baseline AV-SMC-PHD algorithm using experiments on the AV16.3 dataset with multi-speaker sequences.
展开▼