SEMI-SUPERVISED SYSTEM FOR MULTICHANNEL SOURCE ENHANCEMENT THROUGH CONFIGURABLE ADAPTIVE TRANSFORMATIONS AND DEEP NEURAL NETWORK
Abstract
Various techniques are provided to perform enhanced automatic speech recognition. For example, a subband analysis may be performed that transforms the time-domain signals of multiple audio channels into subband signals. An adaptive configurable transformation may also be performed to produce single-channel or multichannel features whose values are correlated with an Ideal Binary Mask (IBM). An unsupervised Gaussian Mixture Model (GMM) may also be fitted to the distribution of the features to produce posterior probabilities, and the posteriors may be combined to produce deep neural network (DNN) feature vectors. A DNN may be provided that predicts oracle spectral gains from the input feature vectors. Spectral processing may be performed to produce an estimate of the target source time-frequency magnitudes from the mixtures and the output of the DNN. Subband synthesis may be performed to transform the signals back to the time domain.
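The analysis, masking, and synthesis stages named in the abstract can be illustrated with a minimal single-channel sketch. It is not the patented method: the GMM and DNN stages are omitted, and the mask is computed directly from a known target and noise (the oracle that a DNN would be trained to approximate). The frame length, hop size, window choice, toy signals, and all function names below are assumptions made for illustration only.

```python
import numpy as np

FRAME, HOP = 256, 128  # assumed frame length and hop size (50% overlap)
# Periodic square-root Hann window: applied at analysis and synthesis,
# the squared windows of overlapping frames sum to 1.
WIN = np.sqrt(np.hanning(FRAME + 1)[:-1])


def analysis(x):
    """Subband analysis: windowed FFT of overlapping frames."""
    n_frames = 1 + (len(x) - FRAME) // HOP
    frames = np.stack([x[i * HOP:i * HOP + FRAME] * WIN for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)  # shape: (frames, frequency bins)


def synthesis(X):
    """Subband synthesis: inverse FFT followed by windowed overlap-add."""
    frames = np.fft.irfft(X, n=FRAME, axis=1) * WIN
    out = np.zeros(HOP * (len(X) - 1) + FRAME)
    for i, frame in enumerate(frames):
        out[i * HOP:i * HOP + FRAME] += frame
    return out


def ideal_binary_mask(S, N):
    """IBM: 1 where the target magnitude exceeds the noise magnitude, else 0."""
    return (np.abs(S) > np.abs(N)).astype(float)


# Toy data (assumed): a 440 Hz tone as the target source, white noise as interference.
fs = 8000
t = np.arange(fs) / fs
rng = np.random.default_rng(0)
target = np.sin(2 * np.pi * 440 * t)
noise = 0.5 * rng.standard_normal(fs)
mixture = target + noise

S, N, X = analysis(target), analysis(noise), analysis(mixture)
mask = ideal_binary_mask(S, N)   # oracle mask, stands in for the DNN output
enhanced = synthesis(mask * X)   # spectral processing, then back to the time domain

# Compare against the target away from the edges, where overlap-add is complete.
inner = slice(HOP, len(enhanced) - HOP)
err_mix = np.mean((mixture[inner] - target[inner]) ** 2)
err_enh = np.mean((enhanced[inner] - target[inner]) ** 2)
```

In this sketch `err_enh` measures the residual error after masking and `err_mix` the error of the unprocessed mixture. In a real system the oracle mask is unavailable at run time, which is why the abstract describes a DNN that predicts the spectral gains from features.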