Deep Learning Based Phase Reconstruction for Speaker Separation: A Trigonometric Perspective



Abstract

This study investigates phase reconstruction for deep learning based monaural talker-independent speaker separation in the short-time Fourier transform (STFT) domain. The key observation is that, for a mixture of two sources, with their magnitudes accurately estimated and under a geometric constraint, the absolute phase difference between each source and the mixture can be uniquely determined; in addition, the source phases at each time-frequency (T-F) unit can be narrowed down to only two candidates. To pick the right candidate, we propose three algorithms based on iterative phase reconstruction, group delay estimation, and phase-difference sign prediction. State-of-the-art results are obtained on the publicly available wsj0-2mix and wsj0-3mix corpora.
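The two-candidate observation in the abstract can be illustrated for a single T-F unit. The sketch below is not the authors' implementation: it assumes the geometric constraint is that the mixture and the two sources form a triangle in the complex plane, so the law of cosines gives the absolute phase difference. The function name `phase_candidates` and its parameters are hypothetical, and the mixture and source magnitudes are assumed to be nonzero.

```python
import cmath
import math


def phase_candidates(mix, mag1, mag2):
    """Return the two candidate (s1, s2) pairs for one T-F unit.

    mix  : complex STFT coefficient of the mixture (assumed nonzero)
    mag1 : estimated magnitude of source 1 (assumed nonzero)
    mag2 : estimated magnitude of source 2 (assumed nonzero)
    """
    mag_y = abs(mix)
    theta_y = cmath.phase(mix)

    # Law of cosines on the triangle with sides |Y|, |S1|, |S2|
    # (assumed form of the geometric constraint).
    cos1 = (mag_y**2 + mag1**2 - mag2**2) / (2 * mag_y * mag1)
    cos2 = (mag_y**2 + mag2**2 - mag1**2) / (2 * mag_y * mag2)

    # Clamp to [-1, 1]: estimated magnitudes may violate the
    # triangle inequality, which would make acos undefined.
    delta1 = math.acos(max(-1.0, min(1.0, cos1)))
    delta2 = math.acos(max(-1.0, min(1.0, cos2)))

    candidates = []
    for sign in (+1, -1):
        # The two sources lie on opposite sides of the mixture,
        # so their phase offsets carry opposite signs.
        s1 = cmath.rect(mag1, theta_y + sign * delta1)
        s2 = cmath.rect(mag2, theta_y - sign * delta2)
        candidates.append((s1, s2))
    return candidates


# Toy example with made-up source coefficients.
s1_true, s2_true = 1 + 2j, 2 - 1j
mix = s1_true + s2_true
candidates = phase_candidates(mix, abs(s1_true), abs(s2_true))
```

With exact magnitudes, both candidate pairs sum to the mixture and one of them equals the true sources; the magnitudes alone cannot decide between them, which is the ambiguity the three algorithms named in the abstract are meant to resolve.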


Bibliographic details

Source: IEEE International Conference on Acoustics, Speech and Signal Processing | 2019 | pp. 71-75 | 5 pages
Authors: Zhong-Qiu Wang; Ke Tan; DeLiang Wang
Original format: PDF
Keywords: Fourier transforms; iterative methods; learning (artificial intelligence); speaker recognition

Similar literature


1. Deep Learning Based Speech Separation via NMF-Style Reconstructions [J]. Shuai Nie, Shan Liang, Wenju Liu. IEEE/ACM Transactions on Audio, Speech, and Language Processing. 2018, No. 11.
2. SpeakerBeam: A New Deep Learning Technology for Extracting Speech of a Target Speaker Based on the Speaker's Voice Characteristics [J]. Marc Delcroix, Katerina Zmolikova, Keisuke Kinoshita. NTT Technical Review. 2018, No. 11.
3. Deep Learning for Talker-Dependent Reverberant Speaker Separation: An Empirical Study [J]. Masood Delfarah, DeLiang Wang. IEEE/ACM Transactions on Audio, Speech, and Language Processing. 2019, No. 11.
4. Deep Learning Based Phase Reconstruction for Speaker Separation: A Trigonometric Perspective [C]. Zhong-Qiu Wang, Ke Tan, DeLiang Wang. IEEE International Conference on Acoustics, Speech and Signal Processing. 2019.
5. Deep Learning Methods for Speaker Separation in Reverberant Conditions [D]. Masood Delfarah. 2019.
6. Deep learning-based smart speaker to confirm surgical sites for cataract surgeries: A pilot study [O]. Tae Keun Yoo, Ein Oh, Hong Kyu Kim. 2020.
7. Deep Learning Based Phase Reconstruction for Speaker Separation: A Trigonometric Perspective [O]. Zhong-Qiu Wang, Ke Tan, DeLiang Wang. 2019.
