Journal: Applied Acoustics

Identification/segmentation of Indian regional languages with singular value decomposition based feature embedding


Abstract

Language identification (LID) is the task of identifying the language of a given spoken utterance. Language segmentation, which locates language boundaries within a multi-language utterance, is equally important. Language identification could be a trivial front-end process for real-time mixed-speech recognition applications. India is a multilingual country, and mixing two languages in a single conversation is very common. In this paper, we experiment with two schemes for language identification in the Indian regional language context, where very little work has been done. Both schemes use singular value-based feature embedding. In the first scheme, singular value decomposition (SVD) is applied to the n-gram utterance matrix; in the second, SVD is applied to the difference supervector matrix space. We observe that in both schemes, 55-65% of the singular value energy is sufficient to capture the language context. We also examine how the two schemes preserve language context. In the n-gram based feature representation, different skip-gram models capture different language contexts. For short test durations, the supervector-based feature representation performs better, whereas for longer test signals the n-gram based features perform better. We also extend our work to language-based segmentation: when segmenting a four-language group, scheme 1 performs well with a ten-language training model, but scheme 2 shows better accuracy with a training model of the same four languages. In a multilingual setup, language-based identification and segmentation will be useful for identifying each language as well as the duration of its presence. Further, a language-specific model can then be used to recognize the speech. (C) 2020 Elsevier Ltd. All rights reserved.
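The first scheme in the abstract (SVD on an n-gram utterance matrix, keeping only part of the singular value energy) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names, the token representation, the raw-count weighting, and the definition of "energy" as the cumulative share of squared singular values are all assumptions made for illustration; the paper may define these differently.

```python
import numpy as np
from collections import Counter


def ngram_counts(tokens, n=2, skip=0):
    """Count n-grams whose elements are (skip + 1) positions apart.

    skip=0 gives ordinary n-grams; skip>0 gives a simple skip-gram variant
    (an assumed definition, for illustration only).
    """
    step = skip + 1
    span = (n - 1) * step
    return Counter(
        tuple(tokens[i + k * step] for k in range(n))
        for i in range(len(tokens) - span)
    )


def ngram_utterance_matrix(utterances, n=2, skip=0):
    """Build a (num_ngrams x num_utterances) matrix of raw n-gram counts."""
    counts = [ngram_counts(u, n, skip) for u in utterances]
    vocab = sorted(set().union(*counts))
    index = {gram: row for row, gram in enumerate(vocab)}
    X = np.zeros((len(vocab), len(utterances)))
    for col, c in enumerate(counts):
        for gram, value in c.items():
            X[index[gram], col] = value
    return X, vocab


def rank_for_energy(singular_values, energy=0.6):
    """Smallest rank k whose squared singular values reach `energy` of the total."""
    cumulative = np.cumsum(singular_values ** 2)
    return int(np.searchsorted(cumulative / cumulative[-1], energy) + 1)


def svd_embedding(X, energy=0.6):
    """Truncate the SVD of X at the requested energy.

    Returns the retained left singular vectors U_k and the k-dimensional
    embedding of each utterance (one row per column of X).
    """
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    k = rank_for_energy(s, energy)
    U_k = U[:, :k]
    return U_k, X.T @ U_k


if __name__ == "__main__":
    # Toy "utterances" as sequences of hypothetical phone-like tokens.
    utterances = [list("abab"), list("abcb"), list("cbcb")]
    X, vocab = ngram_utterance_matrix(utterances, n=2)
    U_k, embeddings = svd_embedding(X, energy=0.6)
    print(X.shape, U_k.shape, embeddings.shape)
```

A new utterance would be embedded by building its count vector over the same `vocab` and multiplying by `U_k`. The `energy` argument corresponds to the 55-65% range reported in the abstract; the value 0.6 here is only a midpoint chosen for the example.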
