Modern technology makes it routine to acquire very high-dimensional multimodal data streams, producing feature spaces whose dimension (p) far exceeds the number of training samples (n). This paper presents a new feature extraction algorithm that addresses the 'small n, large p' problem associated with multimodal data sets. It integrates both regularization and shrinkage with canonical correlation analysis (CCA): regularization parameters increase the diagonal elements of the covariance matrices, while shrinkage parameters decrease the off-diagonal elements. The theory of rough sets is used to determine the optimum regularization parameters of CCA. The effectiveness of the proposed method, along with a comparison with other methods, is demonstrated on three pairs of modalities from two real-life data sets.
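The effect of regularization and shrinkage on the covariance matrices can be illustrated with a minimal NumPy sketch. It is not the paper's algorithm: the abstract does not give the exact formulation, so the additive diagonal term `r`, the multiplicative off-diagonal factor `1 - s`, and the function names `adjust_covariance` and `first_canonical_pair` are all assumptions made for illustration. The parameters are passed in as fixed values, and the rough-set-based parameter selection is not reproduced.

```python
import numpy as np


def adjust_covariance(C, r, s):
    """Increase the diagonal of C by r and scale its off-diagonal by (1 - s).

    Assumed form (not taken from the paper): D + r*I + (1 - s)*(C - D),
    where D is the diagonal part of C, r > 0 and 0 <= s <= 1.
    """
    D = np.diag(np.diag(C))
    return D + r * np.eye(C.shape[0]) + (1.0 - s) * (C - D)


def first_canonical_pair(X, Y, r_x=0.1, r_y=0.1, s_x=0.5, s_y=0.5):
    """Return (rho, w_x, w_y) for the leading pair of canonical directions.

    X is (n, p) and Y is (n, q); rows are paired samples. rho is the sample
    correlation between the projections X @ w_x and Y @ w_y.
    """
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)

    # Sample covariance and cross-covariance matrices.
    Cxx = Xc.T @ Xc / (n - 1)
    Cyy = Yc.T @ Yc / (n - 1)
    Cxy = Xc.T @ Yc / (n - 1)

    # Adjusted within-set covariances; positive definite for r > 0, 0 <= s <= 1,
    # so the Cholesky factorization exists even when p or q exceeds n.
    Lx = np.linalg.cholesky(adjust_covariance(Cxx, r_x, s_x))
    Ly = np.linalg.cholesky(adjust_covariance(Cyy, r_y, s_y))

    # Whitened cross-covariance M = Lx^{-1} Cxy Ly^{-T}.
    M = np.linalg.solve(Lx, Cxy)
    M = np.linalg.solve(Ly, M.T).T

    # Leading singular vectors give the whitened canonical directions.
    U, _, Vt = np.linalg.svd(M, full_matrices=False)
    w_x = np.linalg.solve(Lx.T, U[:, 0])
    w_y = np.linalg.solve(Ly.T, Vt[0])

    rho = np.corrcoef(Xc @ w_x, Yc @ w_y)[0, 1]
    return rho, w_x, w_y
```

With n smaller than p, the unadjusted sample covariance matrices are singular and plain CCA is ill-posed; the adjusted matrices are invertible, which is the motivation for combining the two corrections in the 'small n, large p' setting.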