SEMI-SUPERVISED SYSTEM FOR MULTICHANNEL SOURCE ENHANCEMENT THROUGH CONFIGURABLE ADAPTIVE TRANSFORMATIONS AND DEEP NEURAL NETWORK

Abstract

Various techniques are provided to perform enhanced automatic speech recognition. For example, a subband analysis may be performed that transforms the time-domain signals of multiple audio channels into subband signals. An adaptive configurable transformation may also be performed to produce single-channel or multichannel features whose values are correlated to an Ideal Binary Mask (IBM). An unsupervised Gaussian Mixture Model (GMM) may be fitted to the distribution of the features to produce posterior probabilities, and the posteriors may be combined to produce deep neural network (DNN) feature vectors. A DNN may be provided that predicts oracle spectral gains from the input feature vectors. Spectral processing may be performed to produce an estimate of the target source time-frequency magnitudes from the mixtures and the output of the DNN. Subband synthesis may be performed to transform the signals back to the time domain.
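
The analysis, spectral gain, and synthesis stages named in the abstract can be illustrated with a minimal single-channel sketch. It is not the patented method: the multichannel features, the GMM, and the DNN are all omitted, and the gain is an oracle Ideal Binary Mask computed from a known target and noise, which is only possible in a synthetic test. The frame size, hop, sqrt-Hann window, test signals, and every function name below are assumptions chosen for illustration.

```python
import numpy as np

N_FFT, HOP = 512, 256  # assumed frame length and 50% overlap


def _window():
    # Periodic sqrt-Hann: the squared windows sum to 1 at 50% overlap,
    # so analysis followed by synthesis reconstructs interior samples.
    return np.sqrt(np.hanning(N_FFT + 1)[:-1])


def subband_analysis(x):
    """Transform a time-domain signal into complex subband frames (an STFT)."""
    w = _window()
    n_frames = 1 + (len(x) - N_FFT) // HOP
    frames = np.stack([x[i * HOP:i * HOP + N_FFT] * w for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)


def subband_synthesis(X):
    """Transform subband frames back to the time domain by overlap-add."""
    w = _window()
    frames = np.fft.irfft(X, n=N_FFT, axis=1) * w
    out = np.zeros(HOP * (X.shape[0] - 1) + N_FFT)
    for i, frame in enumerate(frames):
        out[i * HOP:i * HOP + N_FFT] += frame
    return out


def ideal_binary_mask(S, N):
    """Oracle mask: 1 where the target dominates the noise, else 0."""
    return (np.abs(S) > np.abs(N)).astype(float)


def enhance(mixture, gain):
    """Apply a time-frequency gain to the mixture and resynthesize."""
    return subband_synthesis(gain * subband_analysis(mixture))


# Synthetic example: a 440 Hz tone in white noise (assumed test signals).
rng = np.random.default_rng(0)
t = np.arange(16000) / 16000.0
target = np.sin(2 * np.pi * 440 * t)
noise = 0.5 * rng.standard_normal(t.size)
mixture = target + noise

# Stand-in for the DNN output: an oracle gain from the known sources.
gain = ideal_binary_mask(subband_analysis(target), subband_analysis(noise))
estimate = enhance(mixture, gain)
```

In a system of the kind the abstract describes, `gain` would be predicted by the DNN from feature vectors rather than computed from the clean sources. The first and last `HOP` samples of `estimate` are covered by only one frame and are not reconstructed exactly.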

Bibliographic data

Publication number: US2017162194A1
Publication date: 2017-06-08
Original format: PDF
Applicant/assignee: CONEXANT SYSTEMS INC.
Application number: US201615368452
Inventors: XIANGYUAN ZHAO; FRANCESCO NESTA; TRAUSTI THORMUNDSSON
Filing date: 2016-12-02
Classification: G10L15/20; G10L15/02; G10L21/0316; G10L25/84; G10L21/038; G10L15/16; G10L15/14
Country: US
Date added to database: 2022-08-21 13:47:09