Published in: IEEE Transactions on Audio, Speech, and Language Processing

A Supervised Learning Approach to Monaural Segregation of Reverberant Speech

Abstract

A major source of signal degradation in real environments is room reverberation. Monaural speech segregation in reverberant environments is a particularly challenging problem. Although inverse filtering has been proposed to partially restore the harmonicity of reverberant speech before segregation, this approach is sensitive to specific source/receiver and room configurations. This paper proposes a supervised learning approach to monaural segregation of reverberant voiced speech, which learns to map from a set of pitch-based auditory features to a grouping cue encoding the posterior probability of a time-frequency (T-F) unit being target dominant given observed features. We devise a novel objective function for the learning process, which directly relates to the goal of maximizing signal-to-noise ratio. The models trained using this objective function yield significantly better T-F unit labeling. A segmentation and grouping framework is utilized to form reliable segments under reverberant conditions and organize them into streams. Systematic evaluations show that our approach produces very promising results under various reverberant conditions and generalizes well to new utterances and new speakers.
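To make the notion of a "target-dominant" T-F unit concrete, here is a minimal sketch. It is an illustration only, not the paper's method: the energy values, the 0.5 posterior threshold, the function names, and the simplified energy-ratio SNR are all assumptions introduced for this example. The abstract does not specify the features, the classifier, or the exact objective function.

```python
import math

def ideal_binary_mask(target_energy, interference_energy):
    """Label a T-F unit 1 when the target energy exceeds the interference energy."""
    return [
        [1 if t > n else 0 for t, n in zip(t_row, n_row)]
        for t_row, n_row in zip(target_energy, interference_energy)
    ]

def mask_from_posterior(posterior, threshold=0.5):
    """Threshold per-unit posteriors P(target dominant | features) into a binary mask.

    The 0.5 threshold is an assumption for this sketch.
    """
    return [[1 if p > threshold else 0 for p in row] for row in posterior]

def masked_snr_db(mask, target_energy, interference_energy):
    """Simplified SNR in dB: retained target energy over retained interference energy."""
    kept_target = sum(
        m * t for m_row, t_row in zip(mask, target_energy) for m, t in zip(m_row, t_row)
    )
    kept_noise = sum(
        m * n for m_row, n_row in zip(mask, interference_energy) for m, n in zip(m_row, n_row)
    )
    if kept_noise == 0:
        return math.inf  # no interference energy retained
    if kept_target == 0:
        return -math.inf  # no target energy retained
    return 10 * math.log10(kept_target / kept_noise)

# Hypothetical 2x2 grid of T-F unit energies (rows: frequency channels, columns: time frames).
target = [[4.0, 1.0], [0.5, 9.0]]
interference = [[1.0, 2.0], [2.0, 1.0]]

# Hypothetical classifier outputs standing in for the learned posterior.
posterior = [[0.9, 0.2], [0.4, 0.7]]

ibm = ideal_binary_mask(target, interference)
estimated = mask_from_posterior(posterior)
all_units = [[1, 1], [1, 1]]

print("ideal mask:    ", ibm)
print("estimated mask:", estimated)
print("SNR without masking (dB):", masked_snr_db(all_units, target, interference))
print("SNR with estimated mask (dB):", masked_snr_db(estimated, target, interference))
```

In this toy case the estimated mask matches the ideal one, so only target-dominant units are retained and the retained-energy SNR is higher than with no masking. A labeling error on a high-energy unit would cost more SNR than one on a low-energy unit, which is one plausible reason to tie the training objective to SNR rather than to plain classification accuracy.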
