IEEE International Conference on Rebooting Computing

Speaker Differentiation Using a Convolutional Autoencoder


Abstract

In this work, a deep learning solution for differentiating speaker voices in audio from two microphone sources is presented as a step towards solving the cocktail party problem. A convolutional autoencoder was trained on a small sample of data to associate audio snippets with categorical labels. Audio snippets collected as part of this work were used for training and evaluating the model. The audio was converted to a mel-frequency cepstrum representation prior to classification, and the processed data was labeled according to the person or group of persons speaking. The model was trained and evaluated using data with two, three, four, five, and six categories. The result was a model that recognizes when different people are speaking in a 2-person, 3-person, 4-person, 5-person, and 6-person conversation with accuracies of 99.29%, 97.62%, 96.43%, 93.43%, and 88.1%, respectively. Experimental comparisons between the five versions of the model are presented.
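The abstract outlines a pipeline (audio snippets, converted to mel-frequency cepstral features, fed to a convolutional network that predicts a categorical speaker label) without giving implementation details. The sketch below illustrates one way such a pipeline could look in Python using librosa and PyTorch; the feature dimensions, layer sizes, snippet length, and the use of a plain convolutional encoder with a classification head (the paper's full autoencoder, including its decoder and training procedure, is not described) are assumptions for illustration only.

```python
# Hypothetical sketch of an MFCC + convolutional classifier pipeline.
# All architecture and feature parameters below are assumed, not taken from the paper.
import numpy as np
import librosa
import torch
import torch.nn as nn

N_MFCC = 20           # number of mel-frequency cepstral coefficients (assumed)
SNIPPET_FRAMES = 128  # fixed number of time frames per snippet (assumed)

def snippet_to_mfcc(path, sr=16000):
    """Load an audio snippet and convert it to a fixed-size MFCC 'image'."""
    y, _ = librosa.load(path, sr=sr, mono=True)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=N_MFCC)   # shape: (N_MFCC, frames)
    # Pad or truncate along the time axis so every snippet has the same shape.
    if mfcc.shape[1] < SNIPPET_FRAMES:
        mfcc = np.pad(mfcc, ((0, 0), (0, SNIPPET_FRAMES - mfcc.shape[1])))
    else:
        mfcc = mfcc[:, :SNIPPET_FRAMES]
    return mfcc.astype(np.float32)

class ConvSpeakerClassifier(nn.Module):
    """Convolutional encoder plus dense head mapping MFCC snippets to speaker labels."""
    def __init__(self, n_classes):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.head = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * (N_MFCC // 4) * (SNIPPET_FRAMES // 4), 64),
            nn.ReLU(),
            nn.Linear(64, n_classes),   # one logit per speaker category
        )

    def forward(self, x):               # x: (batch, 1, N_MFCC, SNIPPET_FRAMES)
        return self.head(self.encoder(x))

# Example: a 2-speaker model scores one snippet.
model = ConvSpeakerClassifier(n_classes=2)
x = torch.randn(1, 1, N_MFCC, SNIPPET_FRAMES)   # stand-in for snippet_to_mfcc() output
print(model(x).softmax(dim=-1))                 # probability per speaker category
```

For the 3- to 6-speaker versions described in the abstract, only `n_classes` would change; the training labels would identify the person (or group of persons) speaking in each snippet.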
