Audio-Video Speaker Diarization for Unsupervised Speaker and Face Model Creation

Abstract

Our goal is to create speaker models in the audio domain and face models in the video domain from a set of videos in an unsupervised manner. Such models can later be used for speaker identification in the audio domain (answering the question "Who was speaking, and when?") and/or for face recognition ("Who was seen, and when?") in given videos that contain speaking persons. The proposed system is based on an audio-video diarization system that tries to overcome the disadvantages of the individual modalities. Experiments on broadcasts of Czech parliament meetings show that the proposed combination of the individual audio and video diarization systems improves the diarization error rate (DER).
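
The abstract reports results as diarization error rate (DER), which is commonly defined as the sum of missed speech, false-alarm speech, and speaker-confusion time divided by the total reference speech time. The sketch below is a simplified, frame-level illustration of that definition, not the authors' scoring procedure: the function name `frame_der`, the one-label-per-frame input format, and the brute-force label mapping are all assumptions made for this example, and it ignores overlapping speech and scoring collars.

```python
from itertools import permutations


def frame_der(reference, hypothesis):
    """Simplified frame-level DER (hypothetical helper, for illustration only).

    reference, hypothesis: equal-length sequences with one entry per frame;
    each entry is a speaker label, or None for non-speech.
    Hypothesis labels are arbitrary cluster IDs, so the one-to-one mapping
    to reference speakers that maximizes agreement is found by brute force
    (only practical for a handful of speakers).
    """
    if len(reference) != len(hypothesis):
        raise ValueError("reference and hypothesis must have the same length")

    ref_speech = sum(r is not None for r in reference)
    if ref_speech == 0:
        raise ValueError("reference contains no speech frames")

    pairs = list(zip(reference, hypothesis))
    missed = sum(r is not None and h is None for r, h in pairs)
    false_alarm = sum(r is None and h is not None for r, h in pairs)

    # Frames where both sides claim speech: these can be correct or confused.
    both = [(r, h) for r, h in pairs if r is not None and h is not None]
    ref_ids = list(dict.fromkeys(r for r, _ in both))
    hyp_ids = list(dict.fromkeys(h for _, h in both))

    # Pad with None so a hypothesis cluster may stay unmapped.
    candidates = ref_ids + [None] * len(hyp_ids)
    best_correct = 0
    for assignment in permutations(candidates, len(hyp_ids)):
        mapping = dict(zip(hyp_ids, assignment))
        correct = sum(mapping[h] == r for r, h in both)
        best_correct = max(best_correct, correct)

    confusion = len(both) - best_correct
    return (missed + false_alarm + confusion) / ref_speech


reference = ["A", "A", "A", "B", "B", None, None, "B"]
hypothesis = ["x", "x", "y", "y", "y", None, "y", None]
print(frame_der(reference, hypothesis))  # 1 missed + 1 false alarm + 1 confused over 6 speech frames -> 0.5
```

Standard evaluations score time durations rather than fixed frames and use an optimal assignment algorithm instead of enumerating permutations, so this sketch is suitable only for understanding the metric on toy inputs.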

Bibliographic details

Source: International Conference on Text, Speech and Dialogue | 2014 | 8 pages
Authors: Pavel Campr; Marie Kunesova; Jan Vanek; Jan Cech; Josef Psutka
Original format: PDF
Chinese Library Classification: TP391.1-53
Keywords: Audio-video speaker diarization; Audio speaker recognition; Face recognition

Similar literature

1. Unsupervised help-trained LS-SVR-based segmentation in speaker diarization system [J]. Teimoori Farshad, Razzazi Farbod. Multimedia Tools and Applications. 2019, No. 9.
2. Unsupervised deep feature embeddings for speaker diarization [J]. Rehan Ahmad, Syed Zubair. Turkish Journal of Electrical Engineering and Computer Sciences. 2019, No. 4.
3. Audio-Video Speaker Diarization for Unsupervised Speaker and Face Model Creation [C]. Pavel Campr, Marie Kunesova, Jan Vanek. International Conference on Text, Speech and Dialogue. 2014.
4. Automatic Speaker Recognition and Diarization in Co-Channel Speech [D]. Shokouhi, Navid. 2017.
5. Multimodal Speaker Diarization Using a Pre-Trained Audio-Visual Synchronization Model [O]. Rehan Ahmad, Syed Zubair, Hani Alquhayz. 2019.
6. Audio-Video Speaker Diarization for Unsupervised Speaker and Face Model Creation [O]. Pavel Campr, Marie Kunešová, Jan Vaněk. 2016.
