We consider multi-view classification for the challenging scenario where, for some views, there are no labeled training examples. Several discriminative approaches have recently been proposed for special instances of this problem. Here, alternatively, we propose a generative semi-supervised mixture model across all views which, via marginalization, flexibly performs exact class inference given any subset of available views. The proposed model is an extension of semi-supervised mixtures to a multi-view setting, as well as a semi-supervised extension of mixtures of factor analyzers (MFA) [1]. A novel EM algorithm with a computationally efficient E-step is derived for learning our multi-view model. Specializing this formulation to the standard MFA problem also yields a reduced-complexity E-step compared to the original EM algorithm proposed for MFA. Our multi-view method is experimentally demonstrated on digit recognition using audio and lip-video views, achieving results competitive with alternative, discriminative approaches.
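To make the marginalization idea concrete, the following is a minimal sketch (not the paper's actual model) of exact class inference in a two-view Gaussian mixture: because marginalizing a joint Gaussian over unobserved views amounts to dropping those dimensions, the class posterior given any subset of views needs only the observed sub-blocks of each class mean and covariance. The toy dimensions, parameters, and view layout below are illustrative assumptions.

```python
import numpy as np

# Hypothetical toy setup: 2 classes, each modeled by one Gaussian over the
# concatenation of two views (view 0 = dims 0:2, view 1 = dims 2:4).
view_slices = {0: slice(0, 2), 1: slice(2, 4)}
means = {0: np.zeros(4), 1: np.full(4, 3.0)}   # class-conditional means
covs = {c: np.eye(4) for c in (0, 1)}          # class-conditional covariances
priors = {0: 0.5, 1: 0.5}                      # class priors

def log_gauss(x, mu, S):
    """Log-density of a multivariate Gaussian N(mu, S) at x."""
    d = len(mu)
    diff = x - mu
    _, logdet = np.linalg.slogdet(S)
    return -0.5 * (d * np.log(2 * np.pi) + logdet + diff @ np.linalg.solve(S, diff))

def class_posterior(x_obs, observed_views):
    """Exact p(class | observed views) via Gaussian marginalization:
    unobserved views are integrated out simply by indexing the observed
    sub-blocks of each class mean and covariance."""
    idx = np.concatenate([np.arange(4)[view_slices[v]] for v in observed_views])
    logp = np.array([
        np.log(priors[c]) + log_gauss(x_obs, means[c][idx], covs[c][np.ix_(idx, idx)])
        for c in (0, 1)
    ])
    logp -= logp.max()                          # stabilize before exponentiating
    p = np.exp(logp)
    return p / p.sum()

# Inference with both views available, and with only view 0 available.
x_full = np.array([2.9, 3.1, 3.0, 2.8])
print(class_posterior(x_full, [0, 1]))
print(class_posterior(x_full[:2], [0]))
```

In both cases the posterior favors class 1; the same mechanism lets a single trained model classify samples with any pattern of missing views, which is the flexibility the abstract refers to.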