A pertinent learning machine input feature for speaker discrimination by voice

S. Ouamour; H. Sayoud


Abstract

This work is part of a broader speech-indexing project called ISDS and concerns the two types of machine learning classifier used in that project: Neural Networks (NN) and Support Vector Machines (SVM). In this paper, however, we address only the problem of speaker discrimination, using a new relative, reduced model of the speaker; the analysis is therefore restricted to the new relative speaker characteristic used as the input feature of the learning machines (NN and SVM). Speaker discrimination consists of checking whether two speech signals belong to the same speaker, using features extracted directly from the speaker's own speech. The proposed feature is based on a relative characterization of the speaker, called the Relative Speaker Characteristic (RSC), and is well suited to NN and SVM training. RSC models one speaker relative to another, meaning that each speaker model is determined from both its own speech signal and its dual speech. The investigation shows that the relative model, used as the classifier input, improves training by shortening the learning time and increasing the discrimination accuracy of the classifier. Speaker discrimination experiments are carried out on two databases, the Hub4 Broadcast-News database and a telephone speech database, using two learning machines, a Multi-Layer Perceptron (MLP) and a Support Vector Machine (SVM), with several input characteristics. A further comparative study applies two classical discriminative measures (the covariance-based mono-Gaussian distance and the Kullback-Leibler distance) to the same databases. The originality of this relativist approach is that the new characteristic gives the speaker a flexible model, since the model changes every time the competing speaker model changes. Results show that the new input characteristic is of interest for speaker discrimination. Furthermore, using the Relative Speaker Characteristic reduces both the size of the classifier input and the training time.
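As an illustration of the classical baseline mentioned in the abstract, the following is a minimal sketch of a Kullback-Leibler distance between two mono-Gaussian speaker models, used to decide whether two speech segments come from the same speaker. It is not the authors' implementation: the abstract does not give their exact formulation, so the symmetric form of the divergence, the function names, the `(frames, dims)` feature layout, and the decision threshold are all assumptions made for this example. The RSC feature itself is not reproduced here because the abstract does not define it in enough detail.

```python
import numpy as np


def fit_gaussian(features):
    """Fit one full-covariance Gaussian to a (frames, dims) feature matrix."""
    mean = features.mean(axis=0)
    cov = np.cov(features, rowvar=False)
    return mean, cov


def kl_gaussian(mean_p, cov_p, mean_q, cov_q):
    """Closed-form KL(p || q) between two multivariate Gaussians."""
    dim = mean_p.shape[0]
    cov_q_inv = np.linalg.inv(cov_q)
    diff = mean_q - mean_p
    # slogdet avoids overflow/underflow when computing log-determinants.
    _, logdet_p = np.linalg.slogdet(cov_p)
    _, logdet_q = np.linalg.slogdet(cov_q)
    return 0.5 * (
        np.trace(cov_q_inv @ cov_p)
        + diff @ cov_q_inv @ diff
        - dim
        + logdet_q
        - logdet_p
    )


def symmetric_kl(features_a, features_b):
    """Symmetrized KL distance between the Gaussians fitted to two segments."""
    mean_a, cov_a = fit_gaussian(features_a)
    mean_b, cov_b = fit_gaussian(features_b)
    return (kl_gaussian(mean_a, cov_a, mean_b, cov_b)
            + kl_gaussian(mean_b, cov_b, mean_a, cov_a))


def same_speaker(features_a, features_b, threshold):
    """Hypothetical decision rule: small distance means same speaker."""
    return symmetric_kl(features_a, features_b) < threshold
```

In practice the feature matrices would hold per-frame acoustic vectors extracted from each segment, and the threshold would have to be tuned on held-out data; neither choice is specified in the abstract.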


Bibliographic record

Source
International Journal of Speech Technology | 2012, No. 2 | pp. 181-190 | 10 pages
Authors
S. Ouamour; H. Sayoud
Affiliations

Institute of Electronics, USTHB University, Algiers, Algeria;

Institute of Electronics, USTHB University, Algiers, Algeria;

Indexing information
Original format: PDF
Language: English
Keywords
speaker recognition; learning machine input; reduced features; neural networks; support vector machines; mono-gaussian measures; kullback-leibler distance;

Date indexed: 2022-08-17 13:20:53

