Journal: International Journal of Speech Technology

A pertinent learning machine input feature for speaker discrimination by voice


Abstract

This work is part of ISDS, a global speech indexing project, and concerns the two types of machine learning classifier used by that project: Neural Networks (NN) and Support Vector Machines (SVM). In the present paper, however, we deal only with the problem of speaker discrimination, using a new relative, reduced model of the speaker, and we restrict our analysis to the new relative speaker characteristic used as the input feature of the learning machines (NN and SVM). Speaker discrimination consists of checking whether two speech signals belong to the same speaker, using features extracted directly from the speech itself. The proposed feature is based on a relative characterization of the speaker, called the Relative Speaker Characteristic (RSC), and is well suited to NN and SVM training. RSC models one speaker relative to another, meaning that each speaker model is determined from both that speaker's speech signal and its dual speech. This investigation shows that the relative model, used as the classifier input, improves training by reducing the learning time and increasing the discrimination accuracy of the classifier. Speaker discrimination experiments are carried out on two databases, the Hub4 Broadcast-News database and a telephone speech database, using two learning machines, a Multi-Layer Perceptron (MLP) and a Support Vector Machine (SVM), with several input characteristics. A further comparative investigation is conducted on the same databases using two classical discriminative measures: the covariance-based mono-Gaussian distance and the Kullback-Leibler distance. The originality of this relativist approach is that the new characteristic gives the speaker a flexible model, since the model changes whenever the competing speaker model changes. Results show that the new input characteristic is useful for speaker discrimination. Furthermore, using the Relative Speaker Characteristic reduces the size of the classifier input and the training time.
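
The abstract names the Kullback-Leibler distance as one of the two classical baseline measures. As a rough illustration only, and not the paper's implementation, the following sketch fits one Gaussian per speech segment and compares two segments with a symmetrised KL divergence. The diagonal-covariance simplification, the synthetic 4-dimensional frames, and all function names (`fit_diag_gaussian`, `kl_diag`, `symmetric_kl`, `make_segment`) are assumptions made for this example; a real system would use acoustic features extracted from speech.

```python
import math
import random
from statistics import fmean, pvariance


def fit_diag_gaussian(frames):
    """Fit per-dimension mean and variance to a list of equal-length vectors.

    Assumption: a diagonal covariance is used here to keep the sketch short.
    """
    columns = list(zip(*frames))
    means = [fmean(col) for col in columns]
    variances = [pvariance(col, mu) for col, mu in zip(columns, means)]
    return means, variances


def kl_diag(model_p, model_q):
    """Closed-form KL(p || q) for two diagonal-covariance Gaussians."""
    means_p, vars_p = model_p
    means_q, vars_q = model_q
    total = 0.0
    for mp, vp, mq, vq in zip(means_p, vars_p, means_q, vars_q):
        total += vp / vq + (mq - mp) ** 2 / vq - 1.0 + math.log(vq / vp)
    return 0.5 * total


def symmetric_kl(frames_a, frames_b):
    """Symmetrised KL divergence between the Gaussians of two segments."""
    p = fit_diag_gaussian(frames_a)
    q = fit_diag_gaussian(frames_b)
    return kl_diag(p, q) + kl_diag(q, p)


def make_segment(rng, mean, std, n_frames=500, n_dims=4):
    """Hypothetical stand-in for feature frames extracted from one segment."""
    return [[rng.gauss(mean, std) for _ in range(n_dims)]
            for _ in range(n_frames)]


rng = random.Random(0)
seg_a = make_segment(rng, 0.0, 1.0)   # "speaker 1", first segment
seg_b = make_segment(rng, 0.0, 1.0)   # "speaker 1", second segment
seg_c = make_segment(rng, 1.5, 2.0)   # "speaker 2"

same_speaker_score = symmetric_kl(seg_a, seg_b)
diff_speaker_score = symmetric_kl(seg_a, seg_c)
print(same_speaker_score, diff_speaker_score)
```

A small score suggests the two segments share a speaker, and a large one suggests different speakers. Turning the score into a decision needs a threshold tuned on held-out data, which this sketch does not attempt.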