Frontiers in Neuroscience

Keyword Spotting Using Human Electrocorticographic Recordings

Abstract

Neural keyword spotting could form the basis of a speech brain-computer interface for menu navigation if it can be done with low latency and high specificity, comparable to the "wake-word" functionality of modern voice-activated AI assistants. As a proof of concept, this study investigated neural keyword spotting using motor representations of speech in invasively recorded electrocorticographic (ECoG) signals. Neural matched filters were created from monosyllabic consonant-vowel utterances: one keyword utterance and 11 similar non-keyword utterances. These filters were used in an analogue of the acoustic keyword spotting problem, applied here for the first time to neural data. The filter templates were cross-correlated with the neural signal, capturing the temporal dynamics of neural activation across cortical sites. Neural voice activity detection (VAD) was used to identify utterance times, and a discriminative classifier was used to determine whether each utterance was the keyword or non-keyword speech. Model performance appeared to be highly related to electrode placement and spatial density. Vowel height (/a/ vs. /i/) was poorly discriminated in recordings from sensorimotor cortex but was highly discriminable using neural features from superior temporal gyrus during self-monitoring. The best-performing neural keyword detection (5 keyword detections with 2 false positives across 60 utterances) and neural VAD (100% sensitivity, ~1 false detection per 10 utterances) came from high-density recordings (2 mm electrode diameter and 5 mm pitch) from ventral sensorimotor cortex, suggesting that the spatial fidelity and extent of high-density ECoG arrays may be sufficient for speech brain-computer interfaces.
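The cross-correlation step described in the abstract can be illustrated with a minimal sketch. It assumes NumPy and uses purely synthetic data: the function names (`matched_filter_scores`, `detect_events`), the random template, the noise level, the threshold, and the refractory window are all hypothetical choices made for illustration, not the study's actual pipeline. The simple thresholding here only stands in for the neural VAD and discriminative classifier stages, which are not reproduced.

```python
import numpy as np

def matched_filter_scores(signal, template):
    """Cross-correlate a (channels x time) template with a (channels x time) signal.

    Returns one score per valid time lag, summed across channels.
    """
    n_ch, n_t = signal.shape
    t_ch, t_len = template.shape
    if n_ch != t_ch:
        raise ValueError("signal and template need the same channel count")
    # Remove each channel's mean and scale the whole template to unit norm,
    # so a perfect match scores the norm of the mean-removed template.
    tmpl = template - template.mean(axis=1, keepdims=True)
    norm = np.linalg.norm(tmpl)
    if norm > 0:
        tmpl = tmpl / norm
    scores = np.zeros(n_t - t_len + 1)
    for ch in range(n_ch):
        # mode="valid" keeps only lags where the template fully overlaps the signal
        scores += np.correlate(signal[ch], tmpl[ch], mode="valid")
    return scores

def detect_events(scores, threshold, refractory):
    """Greedy peak picking: strongest scores first, at least `refractory` samples apart."""
    candidates = np.flatnonzero(scores > threshold)
    strongest_first = candidates[np.argsort(scores[candidates])[::-1]]
    events = []
    for k in strongest_first:
        if all(abs(int(k) - e) >= refractory for e in events):
            events.append(int(k))
    return sorted(events)

# Synthetic demonstration (all values are assumptions, not data from the study).
rng = np.random.default_rng(0)
n_channels, n_samples, template_len = 4, 1000, 50
template = rng.standard_normal((n_channels, template_len))
signal = 0.1 * rng.standard_normal((n_channels, n_samples))
true_onsets = [200, 600]
for onset in true_onsets:
    signal[:, onset:onset + template_len] += template

scores = matched_filter_scores(signal, template)
# Threshold at half the score the template obtains against itself.
threshold = 0.5 * matched_filter_scores(template, template)[0]
events = detect_events(scores, threshold, refractory=template_len)
print(events)
```

Summing per-channel correlations treats the multichannel template as one spatiotemporal pattern, which is one simple way to capture activation timing across recording sites; a real system would need a template estimated from training utterances and a detection threshold tuned on held-out data.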
