Neural Network Speaker Descriptor in Speaker Diarization of Telephone Speech

Abstract

This paper investigates a speaker representation for a diarization system that clusters short segments of telephone conversations, each produced by a single speaker. The proposed approach uses a neural-network-based descriptor in place of the i-vector descriptor commonly used in state-of-the-art diarization systems. The two techniques were compared on the English part of the CallHome corpus. The results indicate that the i-vector approach is superior on its own, but the proposed descriptor contributes complementary information. A descriptor combining the two therefore represents the speaker in a segment with a lower diarization error (almost 20% relative improvement over the i-vector alone).
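
The idea of a combined descriptor can be illustrated with a minimal sketch. The abstract does not say how the two descriptors are combined or which clustering algorithm is used, so the concatenation of length-normalized vectors, the plain k-means clustering, the synthetic data, and all function names below are assumptions made for illustration only. The error rates passed to `relative_improvement` are made-up numbers, not results from the paper.

```python
import numpy as np


def length_normalize(x):
    """Scale each row (one segment descriptor) to unit Euclidean length."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.maximum(norms, 1e-12)


def combine_descriptors(ivectors, nn_descriptors):
    """Concatenate the two per-segment descriptors (assumed combination rule)."""
    return np.hstack([length_normalize(ivectors),
                      length_normalize(nn_descriptors)])


def cluster_segments(descriptors, n_speakers=2, n_iter=20):
    """Plain k-means over segment descriptors; returns one label per segment."""
    # Deterministic farthest-point initialisation, starting from segment 0.
    centroids = [descriptors[0]]
    for _ in range(1, n_speakers):
        dists = np.min(
            [np.linalg.norm(descriptors - c, axis=1) for c in centroids], axis=0
        )
        centroids.append(descriptors[np.argmax(dists)])
    centroids = np.array(centroids)

    for _ in range(n_iter):
        # Distance of every segment to every centroid, then nearest centroid.
        dists = np.linalg.norm(
            descriptors[:, None, :] - centroids[None, :, :], axis=2
        )
        labels = np.argmin(dists, axis=1)
        for k in range(n_speakers):
            if np.any(labels == k):
                centroids[k] = descriptors[labels == k].mean(axis=0)
    return labels


def relative_improvement(baseline_der, new_der):
    """Relative reduction of the diarization error rate, as a fraction."""
    return (baseline_der - new_der) / baseline_der


# Synthetic stand-in data: 2 speakers, 10 segments each (not CallHome data).
rng = np.random.default_rng(0)
speaker = np.repeat([0, 1], 10)
sign = np.where(speaker[:, None] == 0, 1.0, -1.0)
ivectors = sign + 0.1 * rng.normal(size=(20, 8))         # hypothetical i-vectors
nn_descriptors = -sign + 0.1 * rng.normal(size=(20, 4))  # hypothetical NN outputs

labels = cluster_segments(combine_descriptors(ivectors, nn_descriptors))

# Hypothetical error rates: a drop from 10.0% to 8.0% is a 20% relative gain.
gain = relative_improvement(10.0, 8.0)
```

Length normalization before concatenation keeps either descriptor from dominating the distance purely because of its scale, which is one simple way to let both sources of information contribute to the clustering.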

Bibliographic details

Source
International Conference on Speech and Computer, 2017, pp. 555-563 (9 pages)
Authors
Zbynek Zajic; Jan Zelinka; Ludek Mueller;
Original format: PDF
Keywords
Neural network; Speaker diarization; i-Vector;

Similar literature

1. Speaker diarization using autoassociative neural networks [J]. S. Jothilakshmi, V. Ramalingam, S. Palanivel. Engineering Applications of Artificial Intelligence, 2009, No. 4-5.

2. On the Applicability of Speaker Diarization to Audio Indexing of Non-Speech and Mixed Non-Speech/Speech Video Soundtracks [J]. Robert Mertens, Po-Sen Huang, Luke Gottlieb, et al. International Journal of Multimedia Data Engineering & Management, 2012, No. 3.

3. Speaker/Style-Dependent Neural Network Speech Synthesis Based on Speaker/Style Embedding [J]. Milan Sečujski, Darko Pekar, Siniša Suzić, et al. Journal of Universal Computer Science, 2020, No. 4.

4. Neural Network Speaker Descriptor in Speaker Diarization of Telephone Speech [C]. Zbynek Zajic, Jan Zelinka, Ludek Muller. International Conference on Speech and Computer, 2017.

5. Automatic Speaker Recognition and Diarization in Co-Channel Speech [D]. Shokouhi, Navid. 2017.

6. Speaker-Independent Silent Speech Recognition from Flesh-Point Articulatory Movements Using an LSTM Neural Network [O]. Myungjong Kim, Beiming Cao, Ted Mau, et al.

7. Speaker Diarization using Deep Recurrent Convolutional Neural Networks for Speaker Embeddings [O]. Cyrta, Pawel; Trzciński, Tomasz; Stokowiec, Wojciech. 2017.

8. Robust Speech Processing & Recognition: Speaker ID, Language ID, Speech Recognition/Keyword Spotting, Diarization/Co-Channel/Environmental Characterization, Speaker State Assessment [R]. Hansen, J. H. 2015.

