
MULTI-SPEAKER DIARIZATION OF AUDIO INPUT USING A NEURAL NETWORK


Abstract

An audio analysis platform may receive a portion of an audio input, wherein the audio input corresponds to audio associated with a plurality of speakers. The audio analysis platform may process, using a neural network, the portion of the audio input to determine voice activity of the plurality of speakers during the portion of the audio input, wherein the neural network is trained using reference audio data and reference diarization data corresponding to the reference audio data. The audio analysis platform may determine, based on the neural network being used to process the portion of the audio input, a diarization output associated with the portion of the audio input, wherein the diarization output indicates individual voice activity of the plurality of speakers. The audio analysis platform may provide the diarization output to indicate the individual voice activity of the plurality of speakers during the portion of the audio input.
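
The abstract does not specify the form of the diarization output. As a minimal sketch, assume the neural network emits one activity score per speaker for each fixed-length frame of the audio portion; those scores can then be thresholded per speaker and merged into time segments. The function name `activity_to_segments`, the 0.5 s frame length, the 0.5 threshold, and the sample scores are all hypothetical and are not taken from the patent.

```python
from typing import Dict, List, Tuple


def activity_to_segments(
    activity: List[List[float]],
    frame_sec: float = 0.5,   # assumed frame length, not from the patent
    threshold: float = 0.5,   # assumed decision threshold, not from the patent
) -> Dict[int, List[Tuple[float, float]]]:
    """Turn per-frame, per-speaker activity scores into (start, end) segments.

    activity[t][s] is the score for speaker s in frame t. Each speaker is
    thresholded independently, so overlapping speech is preserved.
    """
    if not activity:
        return {}
    num_speakers = len(activity[0])
    segments: Dict[int, List[Tuple[float, float]]] = {
        s: [] for s in range(num_speakers)
    }
    for s in range(num_speakers):
        start = None  # index of the frame where the current segment began
        for t, frame in enumerate(activity):
            active = frame[s] >= threshold
            if active and start is None:
                start = t
            elif not active and start is not None:
                segments[s].append((start * frame_sec, t * frame_sec))
                start = None
        if start is not None:
            # Close a segment that runs to the end of the audio portion.
            segments[s].append((start * frame_sec, len(activity) * frame_sec))
    return segments


# Hypothetical scores for two speakers over four 0.5 s frames.
scores = [
    [0.9, 0.1],
    [0.8, 0.7],
    [0.2, 0.9],
    [0.1, 0.6],
]
result = activity_to_segments(scores, frame_sec=0.5)
print(result)
```

In this example the two speakers overlap between 0.5 s and 1.0 s, which is the case that per-speaker voice activity is meant to capture and that a single "who is speaking now" label could not.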
