MULTI-SPEAKER DIARIZATION OF AUDIO INPUT USING A NEURAL NETWORK



Abstract

An audio analysis platform may receive a portion of an audio input, wherein the audio input corresponds to audio associated with a plurality of speakers. The audio analysis platform may process, using a neural network, the portion of the audio input to determine voice activity of the plurality of speakers during the portion of the audio input, wherein the neural network is trained using reference audio data and reference diarization data corresponding to the reference audio data. The audio analysis platform may determine, based on the neural network being used to process the portion of the audio input, a diarization output associated with the portion of the audio input, wherein the diarization output indicates individual voice activity of the plurality of speakers. The audio analysis platform may provide the diarization output to indicate the individual voice activity of the plurality of speakers during the portion of the audio input.
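The abstract does not specify the form of the diarization output. As a minimal sketch, assume the neural network emits, for each audio frame, one voice-activity probability per speaker; thresholding each speaker's track independently then yields per-speaker segments, which may overlap in time. The function name `activity_to_segments`, the 0.5 threshold, the frame shift, and the sample probabilities are illustrative assumptions and are not taken from the patent.

```python
from typing import List, Tuple

# (speaker index, start time in seconds, end time in seconds)
Segment = Tuple[int, float, float]


def activity_to_segments(
    probs: List[List[float]],
    threshold: float = 0.5,     # assumed decision threshold
    frame_shift: float = 0.1,   # assumed seconds per frame
) -> List[Segment]:
    """Convert per-frame, per-speaker activity probabilities into segments.

    probs[t][s] is the (hypothetical) network output for speaker s at frame t.
    Each speaker is handled independently, so overlapping speech is preserved.
    """
    if not probs:
        return []
    segments: List[Segment] = []
    num_frames = len(probs)
    for spk in range(len(probs[0])):
        start = None  # frame index where the current active run began
        for t, frame in enumerate(probs):
            active = frame[spk] >= threshold
            if active and start is None:
                start = t
            elif not active and start is not None:
                segments.append(
                    (spk, round(start * frame_shift, 3), round(t * frame_shift, 3))
                )
                start = None
        if start is not None:  # run extends to the end of the portion
            segments.append(
                (spk, round(start * frame_shift, 3), round(num_frames * frame_shift, 3))
            )
    return segments


# Stand-in for network output: 4 frames, 2 speakers.
example = [
    [0.9, 0.1],
    [0.8, 0.7],  # both speakers active in this frame
    [0.2, 0.9],
    [0.1, 0.2],
]
print(activity_to_segments(example, threshold=0.5, frame_shift=0.5))
```

In this example, speaker 0 is active from 0.0 s to 1.0 s and speaker 1 from 0.5 s to 1.5 s, so the interval from 0.5 s to 1.0 s is reported for both speakers rather than being assigned to only one.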


Bibliographic data

Publication number: WO2021045990A1

Publication date: 2021-03-11

Original format: PDF
Applicant/assignee: THE JOHNS HOPKINS UNIVERSITY; HITACHI LTD.

Application number: WO2020US48730
Inventors: FUJITA YUSUKE; WATANABE SHINJI; KANDA NAOYUKI; HORIGUCHI SHOTA

Filing date: 2020-08-31
Classification: G10L17/18; G10L15/08
Country: US
Date added to database: 2022-08-24 17:41:12
