
音声の雑音低減と音源分離に関する研究

English translation: A study on speech noise reduction and sound source separation

Abstract

This thesis addresses noise reduction for single-channel input and speech source separation for multi-channel input. The source separation method is also applied to the speech source localization problem.

Various methods for multi-channel speech separation have been reported. Some of them transform the input time-domain signal into a frequency-domain signal and then apply a binary mask to it. These methods use both the magnitude difference and the time-delay difference to generate the binary mask. The present method uses only one of these two kinds of information. Experiments show that this approach improves separation performance and also makes it easier to increase the number of input channels.

The source separation method is applied to the source localization problem, in which multiple speakers in three-dimensional space are to be localized. Because of the complexity of the geometrical calculation, localizing multiple speakers is generally difficult. Based on the W-Disjoint Orthogonality assumption, the method presented here uses different frequencies for different speakers to calculate the correlation function, yielding almost the same correlation function as that calculated for a single speaker. The method thus decomposes the multi-speaker localization problem into a source separation problem and a single-speaker localization problem, which makes it easier to solve.

Among the many approaches to single-channel noise reduction, some utilize a small speech database. Conventionally, the magnitude of the speech spectrum is stored in the database and the phase information is discarded. In the present method, speech waveforms are stored in the database without transformation. Speech segments that are similar to the noisy input are searched for in the database, using correlation as the similarity measure, and concatenated to generate the output signal. Experimental results show that this method is effective for reducing a certain kind of noise.

To improve the performance of this noise reduction method, the similarity measure is modified. In the modified measure, the correlation function is calculated using the frequencies that carry mostly clean speech information contained in the speech database, and frequencies occupied by noise are ignored as much as possible. Experiments show that this modification enhances noise reduction performance for real environmental noises as well as instrumental music noises.
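The waveform-database idea described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the function name `denoise_by_lookup`, the fixed non-overlapping frames, the normalized-correlation score, and the least-squares gain are all assumptions made for illustration, and the synthetic two-tone "database" merely stands in for stored clean speech.

```python
import numpy as np

def denoise_by_lookup(noisy, database, frame_len):
    """Replace each frame of `noisy` with its best-matching clean segment.

    Hypothetical helper: the thesis does not specify framing or scaling.
    """
    # Candidate segments: every frame_len-long window of the clean database.
    segments = np.lib.stride_tricks.sliding_window_view(database, frame_len)
    seg_norms = np.linalg.norm(segments, axis=1) + 1e-12
    out = np.zeros(len(noisy) // frame_len * frame_len)
    for start in range(0, len(out), frame_len):
        frame = noisy[start:start + frame_len]
        # Normalized correlation between the noisy frame and every segment.
        scores = segments @ frame / (seg_norms * (np.linalg.norm(frame) + 1e-12))
        best = int(np.argmax(scores))
        # Least-squares gain so the chosen segment matches the frame's level.
        gain = (segments[best] @ frame) / seg_norms[best] ** 2
        out[start:start + frame_len] = gain * segments[best]
    return out

# Toy usage: a synthetic two-tone signal plays the role of the speech database.
rng = np.random.default_rng(0)
t = np.arange(2000)
database = np.sin(2 * np.pi * t / 50) + 0.5 * np.sin(2 * np.pi * t / 23)
clean = database[300:700]
noisy = clean + 0.3 * rng.standard_normal(clean.size)
restored = denoise_by_lookup(noisy, database, frame_len=100)
```

Because the output is assembled only from stored clean waveforms, any noise component that does not resemble a database segment cannot appear in it. A practical system would presumably also need overlapping frames and smoothing at segment boundaries, which this sketch omits.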
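The binary-mask separation mentioned in the abstract can be sketched in the same spirit, here using only the magnitude cue. Everything below is an assumption for illustration: the helper names `stft`, `istft`, and `separate_by_level`, the Hann window with 50% overlap, and the two-channel setup in which each source is louder in its own channel. The thesis does not state which of the two cues it uses or how the mask is built.

```python
import numpy as np

def stft(x, n_fft=256):
    """Short-time Fourier transform with a periodic Hann window, 50% overlap."""
    hop = n_fft // 2
    window = np.hanning(n_fft + 1)[:-1]  # overlapping copies sum to 1
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop:i * hop + n_fft] * window
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def istft(spec, n_fft=256):
    """Inverse of `stft` by plain overlap-add (exact away from the edges)."""
    hop = n_fft // 2
    frames = np.fft.irfft(spec, n=n_fft, axis=1)
    out = np.zeros(hop * (len(frames) - 1) + n_fft)
    for i, frame in enumerate(frames):
        out[i * hop:i * hop + n_fft] += frame
    return out

def separate_by_level(ch1, ch2, n_fft=256):
    """Assign each time-frequency bin to the channel where it is louder."""
    X1, X2 = stft(ch1, n_fft), stft(ch2, n_fft)
    mask = np.abs(X1) > np.abs(X2)  # binary mask from magnitude only
    return istft(X1 * mask, n_fft), istft(X2 * ~mask, n_fft)

# Toy usage: two tones that occupy disjoint frequencies, mixed with crosstalk.
fs = 8000
t = np.arange(fs) / fs
s1 = np.sin(2 * np.pi * 440 * t)
s2 = np.sin(2 * np.pi * 1000 * t)
ch1 = s1 + 0.3 * s2
ch2 = 0.3 * s1 + s2
est1, est2 = separate_by_level(ch1, ch2)
```

The mask works here because the two toy sources never share a time-frequency bin, which is the W-Disjoint Orthogonality assumption the abstract refers to. Real speech satisfies it only approximately.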
