Speaker diarization system using MKMFCC parameterization and WLI-fuzzy clustering

V. Subba Ramaiah; R. Rajeswara Rao

Abstract

Speaker diarization is the process of determining "who spoke when?", that is, assigning the appropriate speaker label to each time region in which a speaker was active. In previous work, a model-based speaker diarization system was built using tangential weighted Mel frequency cepstral coefficients as the feature parameter for voice activity detection and the Lion optimization algorithm for clustering the audio stream into speaker groups. In this paper, a speaker diarization system is proposed using multiple kernel weighted Mel frequency cepstral coefficient (MKMFCC) parameterization and Wu-and-Li Index (WLI)-fuzzy clustering. First, an MKMFCC, which uses multiple kernels such as the tangential and exponential kernels to weight the MFCCs, is proposed for feature parameterization. Second, a clustering algorithm called WLI-fuzzy clustering is proposed for grouping the segments that belong to the same speaker. The proposed system is evaluated on the publicly available ELSDSR corpus, using an audio signal with seven different speakers. Performance is analysed using diarization error rate, F-measure and false alarm rate. The results show that the proposed system tracks the active speakers among multiple speakers with improved tracking accuracy.
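As an illustration of the first evaluation measure named above, the following is a minimal sketch of a frame-level diarization error rate (DER), using the common definition: false alarm, missed speech and speaker confusion, summed and divided by the total reference speech. It is not the authors' scoring code. The function name `frame_level_der`, the per-frame label lists, and the use of `None` for non-speech are assumptions made for this example, and the hypothesis labels are assumed to be already mapped to the reference speaker names.

```python
from collections import Counter


def frame_level_der(reference, hypothesis):
    """Return (DER, error counts) for two equal-length per-frame label lists.

    Each entry is a speaker label, or None for a non-speech frame.
    Assumption: hypothesis labels are already mapped onto reference labels,
    so no optimal speaker mapping is computed here.
    """
    if len(reference) != len(hypothesis):
        raise ValueError("reference and hypothesis must have the same length")

    counts = Counter(false_alarm=0, missed=0, confusion=0)
    for ref, hyp in zip(reference, hypothesis):
        if ref is None and hyp is not None:
            counts["false_alarm"] += 1   # speech reported where there is none
        elif ref is not None and hyp is None:
            counts["missed"] += 1        # reference speech not detected
        elif ref is not None and ref != hyp:
            counts["confusion"] += 1     # speech detected, wrong speaker

    ref_speech = sum(1 for ref in reference if ref is not None)
    if ref_speech == 0:
        raise ValueError("reference contains no speech frames")

    errors = counts["false_alarm"] + counts["missed"] + counts["confusion"]
    return errors / ref_speech, counts


# Toy example with two speakers, "A" and "B" (made-up labels, not ELSDSR data).
reference = ["A", "A", "A", None, "B", "B", "B", "B", None, None]
hypothesis = ["A", "A", "B", None, "B", "B", None, "B", "B", None]

der, counts = frame_level_der(reference, hypothesis)
print(f"DER = {der:.3f}")  # DER = 0.429
```

In the toy example there is one confusion frame, one missed frame and one false-alarm frame against seven reference speech frames, so the DER is 3/7. Scoring tools used in practice typically also search for the best mapping between hypothesis and reference speakers and may ignore a short collar around segment boundaries; both are omitted here for brevity.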


Bibliographic details

Source
International Journal of Speech Technology | 2016, Issue 4 | pp. 945-963 | 19 pages
Authors
V. Subba Ramaiah; R. Rajeswara Rao
Affiliations

Mahatma Gandhi Institute of Technology, Kokapet, Hyderabad, Telangana 500075, India

JNTUK, Kakinada, Andhra Pradesh 535002, India

Original format: PDF
Language: English
Keywords
WLI-fuzzy clustering; Multiple kernel; Bayesian Inference criterion; Voice activity detection; i-Vector extraction
