ASR for Under-Resourced Languages From Probabilistic Transcription

Mark A. Hasegawa-Johnson; Preethi Jyothi; Daniel McCloy; Majid Mirbagheri; Giovanni M. di Liberto; Amit Das; Bradley Ekin; Chunxi Liu; Vimal Manohar; Hao Tang; Edmund C. Lalor; Nancy F. Chen; Paul Hager; Tyler Kekona; Rose Sloan; Adrian K. C. Lee


Abstract

In many under-resourced languages it is possible to find text, and it is possible to find speech, but transcribed speech suitable for training automatic speech recognition (ASR) is unavailable. In the absence of native transcripts, this paper proposes the use of a probabilistic transcript: A probability mass function over possible phonetic transcripts of the waveform. Three sources of probabilistic transcripts are demonstrated. First, self-training is a well-established semisupervised learning technique, in which a cross-lingual ASR first labels unlabeled speech, and is then adapted using the same labels. Second, mismatched crowdsourcing is a recent technique in which nonspeakers of the language are asked to write what they hear, and their nonsense transcripts are decoded using noisy channel models of second-language speech perception. Third, EEG distribution coding is a new technique in which nonspeakers of the language listen to it, and their electrocortical response signals are interpreted to indicate probabilities. ASR was trained in four languages without native transcripts. Adaptation using mismatched crowdsourcing significantly outperformed self-training, and both significantly outperformed a cross-lingual baseline. Both EEG distribution coding and text-derived phone language models were shown to improve the quality of probabilistic transcripts derived from mismatched crowdsourcing.
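The idea of a probabilistic transcript, a probability mass function over candidate phone strings, can be illustrated with a minimal Python sketch of noisy-channel decoding for mismatched crowdsourcing. This is not the authors' implementation. The function name `probabilistic_transcript`, the candidate list, the prior, and the channel probabilities are all hypothetical toy values chosen for illustration, and the sketch assumes each phone is written as exactly one letter (no insertions or deletions).

```python
def probabilistic_transcript(candidates, prior, channel, observed):
    """Return a PMF over candidate phone strings given nonspeaker transcripts.

    posterior(phones) is proportional to
        prior(phones) * product over transcripts and positions of
        channel[phone][letter]
    Assumption: one letter per phone, so lengths must match.
    """
    scores = {}
    for phones in candidates:
        score = prior[phones]
        for letters in observed:
            if len(letters) != len(phones):
                # This toy channel cannot explain a length mismatch.
                score = 0.0
                break
            for phone, letter in zip(phones, letters):
                score *= channel[phone].get(letter, 0.0)
        scores[phones] = score

    total = sum(scores.values())
    if total == 0.0:
        raise ValueError("no candidate explains the observed transcripts")
    # Normalize so the scores form a probability mass function.
    return {phones: score / total for phones, score in scores.items()}


# Hypothetical toy setup: two candidate phone strings, equally likely a priori.
candidates = [("p", "a"), ("b", "a")]
prior = {("p", "a"): 0.5, ("b", "a"): 0.5}

# Hypothetical misperception model: P(letter written | phone spoken).
channel = {
    "p": {"p": 0.6, "b": 0.4},
    "b": {"p": 0.3, "b": 0.7},
    "a": {"a": 1.0},
}

# Three nonspeakers each wrote down what they heard.
observed = ["pa", "ba", "pa"]

pmf = probabilistic_transcript(candidates, prior, channel, observed)
for phones, prob in pmf.items():
    print(" ".join(phones), round(prob, 3))
```

In this toy setting, a text-derived phone language model of the kind the abstract mentions would enter through `prior`, replacing the uniform values used here.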

Publication details

Source
Audio, Speech, and Language Processing, IEEE/ACM Transactions on | 2017, Issue 1 | pp. 46-59 | 14 pages
Authors
Mark A. Hasegawa-Johnson; Preethi Jyothi; Daniel McCloy; Majid Mirbagheri; Giovanni M. di Liberto; Amit Das; Bradley Ekin; Chunxi Liu; Vimal Manohar; Hao Tang; Edmund C. Lalor; Nancy F. Chen; Paul Hager; Tyler Kekona; Rose Sloan; Adrian K. C. Lee
Author affiliations

University of Illinois at Urbana–Champaign, Champaign, IL, USA;

University of Illinois at Urbana–Champaign, Champaign, IL, USA;

University of Washington, Seattle, WA, USA;

University of Washington, Seattle, WA, USA;

Trinity College, University of Dublin, College Green, Dublin, Dublin, Ireland;

University of Illinois at Urbana–Champaign, Champaign, IL, USA;

University of Washington, Seattle, WA, USA;

Johns Hopkins University, Baltimore, MD, USA;

Johns Hopkins University, Baltimore, MD, USA;

Toyota Technological Institute Chicago, Chicago, IL, USA;

Trinity College, University of Dublin, College Green, Dublin, Dublin, Ireland;

Institute for Infocomm Research, Singapore;

Massachusetts Institute of Technology, Cambridge, MA, USA;

University of Washington, Seattle, WA, USA;

Columbia University, New York, NY, USA;

University of Washington, Seattle, WA, USA;

Original format: PDF
Language: English
Keywords
Speech; Electroencephalography; Probabilistic logic; Crowdsourcing; Brain models; Artificial neural networks;

