
Multichannel speech recognition using neural networks



Abstract

This specification describes computer-implemented methods and systems. One method includes receiving, by a neural network of a speech recognition system, first data representing a first raw audio signal and second data representing a second raw audio signal. The first raw audio signal and the second raw audio signal describe audio occurring at a same period of time. The method further includes generating, by a spatial filtering layer of the neural network, a spatial filtered output using the first data and the second data, and generating, by a spectral filtering layer of the neural network, a spectral filtered output using the spatial filtered output. Generating the spectral filtered output comprises processing frequency-domain data representing the spatial filtered output. The method still further includes processing, by one or more additional layers of the neural network, the spectral filtered output to predict sub-word units encoded in both the first raw audio signal and the second raw audio signal.
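The three stages named in the abstract (spatial filtering, spectral filtering on frequency-domain data, and further layers that predict sub-word units) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: the spatial layer is modeled as a filter-and-sum over the two channels, the spectral layer as a weighted sum of DFT magnitudes followed by log compression, and the "additional layers" as a single linear layer with a softmax. All function names and weight values are hypothetical; in a trained network these weights would be learned rather than fixed.

```python
import cmath
import math


def fir_filter(signal, taps):
    """Causal FIR filter: y[n] = sum over k of taps[k] * signal[n - k]."""
    return [
        sum(h * signal[n - k] for k, h in enumerate(taps) if n - k >= 0)
        for n in range(len(signal))
    ]


def spatial_filter(channels, taps_per_channel):
    """Filter-and-sum (assumed form): filter each channel, then add them."""
    filtered = [fir_filter(x, h) for x, h in zip(channels, taps_per_channel)]
    return [sum(samples) for samples in zip(*filtered)]


def dft_magnitudes(frame):
    """Magnitudes of the non-negative-frequency DFT bins of one frame."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def spectral_filter(frame, band_weights, floor=1e-3):
    """Weight frequency-domain magnitudes per band, then log-compress.

    The log compression and the floor value are illustrative assumptions.
    """
    mags = dft_magnitudes(frame)
    return [
        math.log(floor + sum(w * m for w, m in zip(weights, mags)))
        for weights in band_weights
    ]


def softmax(scores):
    """Numerically stable softmax."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_subword_posteriors(features, weight_rows):
    """Stand-in for the additional layers: one linear layer plus softmax."""
    scores = [sum(w * f for w, f in zip(row, features)) for row in weight_rows]
    return softmax(scores)


# Two microphones capture the same 8-sample window; the second hears
# the source one sample later than the first.
mic1 = [math.sin(2 * math.pi * 2 * t / 8) for t in range(8)]
mic2 = [0.0] + mic1[:-1]

# Hypothetical spatial taps: delay mic1 by one sample so it lines up
# with mic2, then average the two channels.
spatial_taps = [[0.0, 0.5], [0.5, 0.0]]
spatial = spatial_filter([mic1, mic2], spatial_taps)

# Hypothetical spectral weights: two overlapping bands over the 5 DFT bins.
band_weights = [[1.0, 1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0, 1.0]]
features = spectral_filter(spatial, band_weights)

# Hypothetical output layer over three sub-word units.
output_weights = [[0.5, -0.2], [-0.1, 0.3], [0.0, 0.1]]
posteriors = predict_subword_posteriors(features, output_weights)
print(posteriors)
```

With these hand-picked taps the spatial stage simply time-aligns and averages the two microphones. The point of placing that stage inside the network is that the taps can be trained jointly with the spectral weights and the later layers, rather than being set by hand as they are here.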


Bibliographic details

Publication number: US11062725B2
Publication date: 2021-07-13
Original format: PDF
Applicant/assignee: GOOGLE LLC
Application number: US201916278830
Inventors: EHSAN VARIANI; KEVIN WILLIAM WILSON; RON J. WEISS; TARA N. SAINATH; ARUN NARAYANAN
Filing date: 2019-02-19
Classification: G10L15/16; G10L25/30; G10L21/028; G10L21/0388; G10L19/008; G10L15/20; G10L21/0208; G10L21/0216
Country: US
Database entry time: 2022-08-24 19:54:09
