首页>
外国专利>
VERY DEEP CONVOLUTIONAL NEURAL NETWORKS FOR END-TO-END SPEECH RECOGNITION
VERY DEEP CONVOLUTIONAL NEURAL NETWORKS FOR END-TO-END SPEECH RECOGNITION
展开▼
机译:用于端到端语音识别的非常深层的卷积神经网络
展开▼
页面导航
摘要
著录项
相似文献
摘要
A speech recognition neural network system includes an encoder neural network and a decoder neural network. The encoder neural network generates an encoded sequence from an input acoustic sequence that represents an utterance. The input acoustic sequence includes a respective acoustic feature representation at each of a plurality of input time steps, the encoded sequence includes a respective encoded representation at each of a plurality of time reduced time steps, and the number of time reduced time steps is less than the number of input time steps. The encoder neural network includes a time reduction subnetwork, a convolutional LSTM subnetwork, and a network in network subnetwork. The decoder neural network receives the encoded sequence and processes the encoded sequence to generate, for each position in an output sequence order, a set of sub string scores that includes a respective sub string score for each substring in a set of substrings.
展开▼