SpeakerBeam: A New Deep Learning Technology for Extracting Speech of a Target Speaker Based on the Speaker’s Voice Characteristics

Abstract

In a noisy environment such as a cocktail party, humans can focus on listening to a desired speaker, an ability known as selective hearing. Existing approaches to computational selective hearing require knowing the position of the target speaker, which limits their practical use. This article introduces SpeakerBeam, a deep-learning-based approach to computational selective hearing that relies on the characteristics of the target speaker's voice. SpeakerBeam requires only a small amount of speech data from the target speaker to compute their voice characteristics. It can then extract the speech of that speaker regardless of their position or the number of speakers talking in the background.
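
The two-step flow described above (derive voice characteristics from a short enrollment recording, then use them to extract that speaker from a mixture) can be sketched roughly as follows. The abstract does not describe the network itself, so everything here is an assumption made for illustration: the layer sizes, the time-averaged embedding, the multiplicative scaling of the hidden layer, and the names `speaker_embedding` and `extract_target` are all hypothetical, and the weights are random and untrained. It shows only how the data might move, not the published method.

```python
import numpy as np

rng = np.random.default_rng(0)
N_FREQ, N_HIDDEN = 129, 64  # hypothetical sizes, not taken from the article

# Random, untrained weights: the sketch illustrates data flow only.
W_in = rng.standard_normal((N_FREQ, N_HIDDEN)) * 0.1
W_spk = rng.standard_normal((N_FREQ, N_HIDDEN)) * 0.1
W_out = rng.standard_normal((N_HIDDEN, N_FREQ)) * 0.1


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def speaker_embedding(enroll_mag):
    """Summarize an enrollment magnitude spectrogram (frames x freq)
    as one vector by averaging its log-magnitude over time."""
    return np.log1p(enroll_mag).mean(axis=0)


def extract_target(mixture_mag, enroll_mag):
    """Estimate the target speaker's magnitude spectrogram from a mixture.

    The speaker vector scales the hidden units, so the same weights
    produce a different mask for each enrolled speaker.
    """
    emb = speaker_embedding(enroll_mag)               # (N_FREQ,)
    scale = sigmoid(emb @ W_spk)                      # (N_HIDDEN,)
    hidden = np.tanh(np.log1p(mixture_mag) @ W_in)    # (frames, N_HIDDEN)
    hidden = hidden * scale                           # speaker-dependent scaling
    mask = sigmoid(hidden @ W_out)                    # (frames, N_FREQ), in (0, 1)
    return mask * mixture_mag, mask


# Synthetic stand-ins for real spectrograms (assumed shapes: frames x freq).
mixture = np.abs(rng.standard_normal((50, N_FREQ)))
enroll = np.abs(rng.standard_normal((20, N_FREQ)))
estimate, mask = extract_target(mixture, enroll)
```

Nothing in this sketch depends on where the speaker is or how many other speakers are present, which matches the property the abstract emphasizes. A real system would learn the weights from training data rather than drawing them at random.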
