
PERMUTATION INVARIANT TRAINING FOR TALKER-INDEPENDENT MULTI-TALKER SPEECH SEPARATION



Abstract

The techniques described herein improve methods to equip a computing device to conduct automatic speech recognition (“ASR”) in talker-independent multi-talker scenarios. In some examples, permutation invariant training of deep learning models can be used for talker-independent multi-talker scenarios. In some examples, the techniques can determine a permutation-considered assignment between a model's estimate of a source signal and the source signal. In some examples, the techniques can include training the model that generates the estimate to minimize a deviation of the permutation-considered assignment. These techniques can be implemented in a neural network's structure itself, solving the label permutation problem that had prevented progress on deep-learning-based techniques for speech separation. The techniques discussed herein can also include source tracing to trace streams originating from the same source through the frames of a mixed signal.
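The permutation-considered assignment mentioned in the abstract can be illustrated with a small sketch. It is not the patented implementation: the abstract does not specify the deviation measure, the signal representation, or the model, so the mean squared error, the flattened toy signals, and the names `mse` and `pit_loss` are all assumptions made for illustration. The idea shown is only that the loss is evaluated under every possible pairing of model outputs with reference sources, and the pairing with the smallest deviation is the one used.

```python
from itertools import permutations


def mse(a, b):
    """Mean squared error between two equal-length, non-empty sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def pit_loss(estimates, references):
    """Return (loss, perm) for the best assignment of estimates to references.

    estimates, references: lists of S flattened signals (hypothetical format).
    perm[i] is the index of the reference assigned to estimate i.
    MSE is an assumed deviation measure, not one stated in the abstract.
    """
    if len(estimates) != len(references) or not references:
        raise ValueError("need one estimate per reference source")
    best_loss, best_perm = float("inf"), None
    for perm in permutations(range(len(references))):
        # Average the per-stream error under this candidate assignment.
        loss = sum(mse(est, references[j])
                   for est, j in zip(estimates, perm)) / len(perm)
        if loss < best_loss:
            best_loss, best_perm = loss, perm
    return best_loss, best_perm


# Two toy "sources"; the model emits them in swapped order with small errors.
references = [[1.0, 0.0, 1.0, 0.0], [0.0, 2.0, 0.0, 2.0]]
estimates = [[0.1, 1.9, 0.0, 2.0], [1.0, 0.1, 0.9, 0.0]]
loss, perm = pit_loss(estimates, references)
print(perm, loss)
```

In this toy case the selected assignment is `(1, 0)`: the first output is matched to the second reference and vice versa, so the swapped output order is not penalized. During training, the minimum loss would be the quantity to minimize. The exhaustive search grows factorially with the number of sources, which is acceptable for two or three talkers.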


Bibliographic data

Publication number: EP3459077A1

Patent type:
Publication date: 2019-03-27

Original format: PDF
Applicant/patentee: MICROSOFT TECHNOLOGY LICENSING LLC;

Application number: EP20170726742
Inventor: YU DONG;

Filing date: 2017-05-06
Classification: G10L21/0272;
Country: EP
Date added to database: 2022-08-21 12:27:19
