Conference: 2017 International Conference on Wireless Communications, Signal Processing and Networking

Speaker identification using vector quantization and I-vector with reference to Assamese language

Abstract

This paper describes the implementation of a speaker identification system for the Assamese language. The database consists of speech samples collected from 15 speakers for ten Assamese words representing the digits 0 (shounyo) to 9 (no). Mel Frequency Cepstral Coefficients (MFCC) are used as features. Two independent speaker identification systems are built, one using Vector Quantization (VQ) and one using the I-vector technique. For each technique, three systems are built with different feature sizes. The I-vector system achieves better identification accuracy than the VQ system, reaching a maximum accuracy of 92.38% with 39 MFCC features.
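
To make the VQ approach named in the abstract concrete, the following is a minimal sketch of codebook-based speaker identification. It is not the authors' implementation. It assumes one k-means codebook per speaker and picks the speaker whose codebook gives the lowest average distortion. Synthetic Gaussian frames stand in for real MFCC vectors, and the function names, the codebook size of 8, and the speaker labels are all hypothetical.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_codebook(frames, k, iters=20, seed=0):
    """Build a k-entry VQ codebook with plain k-means (Lloyd's algorithm)."""
    rng = random.Random(seed)
    codebook = [list(f) for f in rng.sample(frames, k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for f in frames:
            nearest = min(range(k), key=lambda j: sq_dist(f, codebook[j]))
            clusters[nearest].append(f)
        for j, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centroid
                dim = len(members[0])
                codebook[j] = [
                    sum(m[d] for m in members) / len(members) for d in range(dim)
                ]
    return codebook


def avg_distortion(frames, codebook):
    """Mean squared distance from each frame to its nearest codeword."""
    return sum(min(sq_dist(f, c) for c in codebook) for f in frames) / len(frames)


def identify(frames, codebooks):
    """Return the speaker label whose codebook has the lowest distortion."""
    return min(codebooks, key=lambda spk: avg_distortion(frames, codebooks[spk]))


def synthetic_frames(center, n, dim, rng):
    """Assumption: Gaussian vectors used as a stand-in for MFCC frames."""
    return [[rng.gauss(center, 1.0) for _ in range(dim)] for _ in range(n)]


rng = random.Random(42)
DIM = 39  # matches the 39-feature configuration reported in the abstract

# Hypothetical enrollment data for two speakers.
train = {
    "spk_a": synthetic_frames(0.0, 200, DIM, rng),
    "spk_b": synthetic_frames(5.0, 200, DIM, rng),
}
codebooks = {spk: train_codebook(fr, k=8) for spk, fr in train.items()}

# A test utterance drawn from the same distribution as spk_b.
test_utt = synthetic_frames(5.0, 50, DIM, rng)
predicted = identify(test_utt, codebooks)
print(predicted)
```

In a real system the synthetic frames would be replaced by MFCC vectors extracted from recorded speech, and the codebook size would be tuned on held-out data.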
