
Combining predictive coding and neural oscillations enables online syllable recognition in natural speech


Abstract

The bottom level encodes the dynamics in the input signal, which consists of two parts: the condensed auditory spectrogram (on the right) and the slow amplitude modulation of the input signal (on the left), derived by applying a spectrotemporal filter to the spectrogram. The theta module is modelled by a canonical theta-neuron model, which is fed with the slow amplitude modulation that the model infers from the continuous speech signal. Whenever the theta oscillation reaches a predefined phase, the model generates a Gaussian pulse, referred to as a theta trigger (red pulses under ‘Syllable onsets’). Depending on the input, theta triggers appear sooner or later, and they constitute the model’s estimates of syllable onsets. This information is used to reset gamma activity in the spectrotemporal module (solid arrow from theta to spectrotemporal module). Similarly, the instantaneous frequency/rate of the theta oscillator is used to set the preferred rate of the gamma sequence (dashed red line from theta to spectrotemporal module). Together, gamma and syllable units encode the dynamics of the frequency channels in the input. The last (8th) gamma unit represents the model’s estimate of the syllable offset (based on their pre-learned spectral structure); hence it is used to reset syllable units to a common value (upward arrows). During the inference process, the activation level of each syllable unit changes based on bottom-up prediction errors. The identified syllables are read out from the dynamics of the syllable units.

A simplified diagram of the model indicates the functional connections. The solid arrow from the theta module to the gamma units indicates the reset of gamma activity. The dashed red line represents rate information received from the theta oscillation. Finally, the arrow from the gamma units to the syllable units indicates the reset of the syllable units.
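The theta stage described above can be illustrated with a minimal sketch. It assumes the standard Ermentrout-Kopell form of the canonical theta neuron and plain Euler integration; the paper's own equations and parameter values are not given in this record, so `rate_scale`, `trigger_phase`, `width` and both function names are hypothetical choices made only for illustration.

```python
import math


def simulate_theta_neuron(drive, dt=0.001, rate_scale=5.0 * math.pi,
                          trigger_phase=math.pi):
    """Euler-integrate a canonical theta neuron; return trigger times in seconds.

    Assumed dynamics (standard textbook form, not taken from the paper):
        d(theta)/dt = rate_scale * ((1 - cos(theta)) + (1 + cos(theta)) * drive(t))
    `drive` is a sequence of input samples, standing in for the slow
    amplitude modulation. With drive == 1 the phase advances at a constant
    2 * rate_scale rad/s, i.e. 5 Hz for the default rate_scale.
    """
    theta = -math.pi  # phase is kept in [-pi, pi)
    triggers = []
    for step, i_t in enumerate(drive):
        dtheta = rate_scale * ((1 - math.cos(theta)) + (1 + math.cos(theta)) * i_t)
        new_theta = theta + dt * dtheta
        # A trigger is emitted when the phase crosses the predefined value.
        if theta < trigger_phase <= new_theta:
            triggers.append((step + 1) * dt)
        # Wrap the phase once it completes a cycle.
        if new_theta >= math.pi:
            new_theta -= 2 * math.pi
        theta = new_theta
    return triggers


def gaussian_pulse_train(triggers, n_steps, dt=0.001, width=0.01):
    """Sum of unit-height Gaussian pulses centred on each trigger time."""
    pulses = []
    for k in range(n_steps):
        t = k * dt
        pulses.append(sum(math.exp(-0.5 * ((t - t0) / width) ** 2)
                          for t0 in triggers))
    return pulses


if __name__ == "__main__":
    weak = simulate_theta_neuron([1.0] * 1100)    # weaker drive
    strong = simulate_theta_neuron([4.0] * 1100)  # stronger drive
    print("first trigger, weak drive:  ", weak[0])
    print("first trigger, strong drive:", strong[0])
```

In this sketch a stronger drive moves the phase through its cycle faster, so the trigger arrives earlier, which mirrors the statement that theta triggers appear sooner or later depending on the input. The interval between successive triggers could likewise serve as a stand-in for the instantaneous theta rate that sets the preferred rate of the gamma sequence.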


Bibliographic record

Journal Nature Communications
Authors
Sevada Hovsepyan; Itsaso Olasagasti; Anne-Lise Giraud
Author affiliations
Year (volume), issue -1 (11), -1
Year -1
Pages -1
Total pages 12
Original format PDF
Language of text
CLC classification
Keywords
Computational neuroscience; Sensory processing

Similar literature

1. Combining predictive coding and neural oscillations enables online syllable recognition in natural speech [J]. Sevada Hovsepyan, Itsaso Olasagasti, Anne-Lise Giraud. Nature Communications. 2020, Issue 1
2. Reverberant speech recognition combining deep neural networks and deep autoencoders augmented with a phone-class feature [J]. Masato Mimura, Shinsuke Sakai, Tatsuya Kawahara. EURASIP Journal on Advances in Signal Processing. 2015, Issue 1
3. Recognition of time-compressed speech does not predict recognition of natural fast-rate speech by older listeners [J]. Sandra Gordon-Salant, Danielle J. Zion, Carol Espy-Wilson. The Journal of the Acoustical Society of America. 2014, Issue 4aPta1
4. Recognition of syllables in a continuous stream of speech by PARCOR parameters of linear predictive vocoder [C]. Ying Cui, Kunio Takaya. Canadian Conference on Electrical and Computer Engineering. 2005
5. A neural predictive HMM architecture for speech and speaker recognition [D]. Khaled Saad Hassanein. 1994
6. Focal Manipulations of Formant Trajectories Reveal a Role of Auditory Feedback in the Online Control of Both Within-Syllable and Between-Syllable Speech Timing [O]. Shanqing Cai, Satrajit S. Ghosh, Frank H. Guenther. 2011
7. Combining predictive coding and neural oscillations enables online syllable recognition in natural speech [O]. Sevada Hovsepyan, Itsaso Olasagasti, Anne-Lise Giraud. 2020
