Conference: IEEE International Conference on Acoustics, Speech and Signal Processing

Investigation of Methods to Improve the Recognition Performance of Tamil-English Code-Switched Data in Transformer Framework


Abstract

Code-switching (CS) refers to (inter/intra-word) switching between multiple languages in a single conversation. In multilingual countries like India, CS occurs very often in everyday speech, giving rise to new hybrid languages in urban regions, such as Hinglish (Hindi-English) and Tanglish (Tamil-English). Research on Indic CS speech recognition is held back primarily by insufficient data. In this paper, we investigate methods to deal with such very low-resource scenarios. Recently, Transformers have shown promising results on automatic speech recognition (ASR) tasks. Within a Transformer-based framework, we investigate two methods for Tamil-English CS speech recognition: (i) using well-trained encoders of monolingual Transformers as feature extractors to provide language discrimination, and (ii) adding language information as tokens at the targets. Our results show that the second method handles CS efficiently, while the first method is efficient at discriminating between the languages.
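
The abstract does not say how the language tokens in method (ii) are placed, so the following is only a minimal sketch of one plausible scheme, not the authors' implementation. It assumes Tamil words are written in Tamil script and English words in Latin script, and it inserts a language token at every switch point of a target transcript. The token names `<ta>` and `<en>` and both function names are hypothetical.

```python
from typing import List

# Unicode Tamil block; used here as an assumed proxy for "this word is Tamil".
TAMIL_START, TAMIL_END = 0x0B80, 0x0BFF


def word_language(word: str) -> str:
    """Return '<ta>' if the word contains any Tamil-script character, else '<en>'."""
    if any(TAMIL_START <= ord(ch) <= TAMIL_END for ch in word):
        return "<ta>"
    return "<en>"


def add_language_tokens(transcript: str) -> List[str]:
    """Insert a language token before each run of same-language words."""
    tagged: List[str] = []
    previous = None
    for word in transcript.split():
        lang = word_language(word)
        if lang != previous:  # language switch (or first word): emit a token
            tagged.append(lang)
            previous = lang
        tagged.append(word)
    return tagged


if __name__ == "__main__":
    print(add_language_tokens("நான் office போகிறேன்"))
```

Under these assumptions, a word that mixes scripts (an intra-word switch) is tagged as Tamil, and romanized Tamil would be tagged as English, so a real system would need a more careful language identifier.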
