Journal: ACM Transactions on Asian and Low-Resource Language Information Processing

Approaches for Multilingual Phone Recognition in Code-switched and Non-code-switched Scenarios Using Indian Languages

Abstract

In this study, we evaluate and compare two approaches to multilingual phone recognition in code-switched and non-code-switched scenarios. The first approach is a front-end Language Identification (LID) system that switches to a monolingual phone recognizer (LID-Mono), with one recognizer trained individually on each language in the multilingual dataset. In the second approach, a common multilingual phone-set derived from the International Phonetic Alphabet (IPA) transcription of the multilingual dataset is used to develop a Multilingual Phone Recognition System (Multi-PRS). The bilingual code-switching experiments are conducted using Kannada and Urdu. In the first approach, LID is performed using state-of-the-art i-vectors. Both the monolingual and the multilingual phone recognition systems are trained using Deep Neural Networks. The performance of the LID-Mono and Multi-PRS approaches is compared and analysed in detail. The Multi-PRS approach is found to outperform the more conventional LID-Mono approach in both code-switched and non-code-switched scenarios. For code-switched speech, the effect of the length of the segments used to perform LID on the performance of the LID-Mono system is studied by varying the window size from 500 ms to 5.0 s, and by using the full utterance. The LID-Mono approach depends heavily on the accuracy of the LID system, and LID errors cannot be recovered. The Multi-PRS system, by contrast, requires no front-end LID switching and is built on a common multilingual phone-set derived from several languages. It is therefore not constrained by the accuracy of the LID system and performs effectively on both code-switched and non-code-switched speech, offering lower Phone Error Rates than the LID-Mono system.
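The LID-Mono pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the `window_frames` parameter, and the callable interfaces for the LID system and the recognizers are all hypothetical, and the abstract does not specify how Phone Error Rate is computed, so the usual edit-distance definition (substitutions, deletions, and insertions divided by reference length) is assumed here.

```python
from typing import Callable, Dict, List, Sequence


def split_into_windows(num_frames: int, window_frames: int) -> List[range]:
    """Cut an utterance into consecutive fixed-size windows (the last may be shorter)."""
    if window_frames <= 0:
        raise ValueError("window_frames must be positive")
    return [range(start, min(start + window_frames, num_frames))
            for start in range(0, num_frames, window_frames)]


def lid_mono_decode(
    frames: Sequence,
    window_frames: int,
    identify_language: Callable[[Sequence], str],       # hypothetical LID front end
    recognizers: Dict[str, Callable[[Sequence], List[str]]],  # one per language
) -> List[str]:
    """Route each window to the monolingual recognizer chosen by the LID front end.

    The LID decision is hard: if it is wrong for a window, that window is
    decoded by the wrong recognizer and the error cannot be undone later.
    """
    phones: List[str] = []
    for window in split_into_windows(len(frames), window_frames):
        segment = [frames[i] for i in window]
        language = identify_language(segment)
        phones.extend(recognizers[language](segment))
    return phones


def phone_error_rate(reference: Sequence[str], hypothesis: Sequence[str]) -> float:
    """Assumed definition: Levenshtein distance between phone sequences / len(reference)."""
    if not reference:
        raise ValueError("reference must not be empty")
    prev = list(range(len(hypothesis) + 1))
    for i, ref_phone in enumerate(reference, 1):
        curr = [i]
        for j, hyp_phone in enumerate(hypothesis, 1):
            cost = 0 if ref_phone == hyp_phone else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(reference)
```

In this sketch, a window that straddles a language switch is decoded entirely by one monolingual recognizer, which is one way the window size can affect LID-Mono performance on code-switched speech. A Multi-PRS system would correspond to calling a single recognizer on the whole utterance, with no `identify_language` step.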