A large-vocabulary continuous speech recognition system for Hindi

Abstract

In this paper we present two new techniques that have been used to build a large-vocabulary continuous Hindi speech recognition system. We present a technique for fast bootstrapping of initial phone models of a new language. The training data for the new language is aligned using an existing speech recognition engine for another language. This aligned data is used to obtain the initial acoustic models for the phones of the new language. Following this approach requires less training data. We also present a technique for generating baseforms (phonetic spellings) for phonetic languages such as Hindi. As is inherent in phonetic languages, rules generally capture the mapping of spelling to phonemes very well. However, deep linguistic knowledge is required to write all possible rules, and there are some ambiguities in the language that are difficult to capture with rules. On the other hand, pure statistical techniques for baseform generation require large amounts of training data, which is not readily available. We propose a hybrid approach that combines rule-based and statistical approaches in a two-step fashion. We evaluate the performance of the proposed approaches through various phonetic classification and recognition experiments.
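
The two-step hybrid idea for baseform generation can be illustrated with a minimal sketch. Everything concrete in it is an assumption made for illustration and is not taken from the paper: the toy Latin-letter alphabet, the `RULES` table, the four-word `LEXICON`, and the use of greedy previous-phone bigram counts as the statistical step. The abstract does not say which rules or which statistical model the authors used.

```python
from collections import Counter

# Step 1 (rule-based), hypothetical rule table: each letter maps to one or
# more candidate phones. More than one candidate marks an ambiguity that the
# rules alone cannot settle.
RULES = {
    "k": ["K"], "m": ["M"], "l": ["L"],
    "a": ["AA"], "i": ["IY"],
    "c": ["K", "S"],  # deliberately ambiguous toy letter
}

# Hypothetical hand-labelled lexicon used to train the statistical step.
LEXICON = {
    "cima": ["S", "IY", "M", "AA"],
    "cila": ["S", "IY", "L", "AA"],
    "maca": ["M", "AA", "K", "AA"],
    "kali": ["K", "AA", "L", "IY"],
}

def rule_candidates(word):
    """Apply the rules: return one list of candidate phones per letter."""
    return [RULES[ch] for ch in word]

def train_bigrams(lexicon):
    """Count (previous phone, phone) pairs over the labelled baseforms."""
    counts = Counter()
    for phones in lexicon.values():
        prev = "<s>"  # word-start marker
        for p in phones:
            counts[(prev, p)] += 1
            prev = p
    return counts

def baseform(word, counts):
    """Step 2 (statistical): pick among rule candidates, left to right."""
    prev, out = "<s>", []
    for cands in rule_candidates(word):
        # max() keeps the first maximal item, so a tie (or no evidence at all)
        # falls back to the first candidate listed in RULES.
        best = max(cands, key=lambda p: counts[(prev, p)])
        out.append(best)
        prev = best
    return out

counts = train_bigrams(LEXICON)
print(baseform("cika", counts))  # ['S', 'IY', 'K', 'AA']
print(baseform("laca", counts))  # ['L', 'AA', 'K', 'AA']
```

In this sketch the rules narrow each letter to a short candidate list, so the statistical step only has to learn the few ambiguous cases. That is one way to read the abstract's point that a hybrid needs less training data than a purely statistical approach.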

Bibliographic details

Source: IBM Journal of Research and Development, 2004, No. 5, pp. 703-715 (13 pages)
Authors: Kumar M, Rajput N, Verma A
Original format: PDF

Similar literature

1. A large-vocabulary continuous speech recognition system for Hindi [J]. Kumar M, Rajput N, Verma A. IBM Journal of Research and Development, 2004, No. 5a6.
2. Large-Vocabulary Continuous Speech Recognition Systems: A Look at Some Recent Advances [J]. Saon G., Chien J.-T. Signal Processing Magazine, IEEE, 2012, No. 6.
3. A VLSI grammar processing subsystem for a real-time large-vocabulary continuous speech recognition system [J]. Chen D.C., Yu R. IEEE Journal of Solid-State Circuits, 1991, No. 3.
4. Parallelized Viterbi Processor for 5,000-Word Large-Vocabulary Real-Time Continuous Speech Recognition FPGA System [C]. Tsuyoshi Fujinaga, Kazuo Miura, Hiroki Noguchi. International Speech Communication Association, 2009.
5. Large-vocabulary speaker-independent continuous speech recognition: The SPHINX system [D]. Lee, Kai-Fu. 1988.
6. Evaluation of the accuracy of a continuous speech recognition software system in radiology [O]. Kalpana M. Kanal, Nicholas J. Hangiandreou, Anne-Marie G. Sykes. 2000.
7. A Large-Vocabulary Continuous Speech Recognition Algorithm and its Application to a Multi-modal Telephone Directory Assistance System [O]. Yasuhiro Minami, Kiyohiro Shikano, Osamu Yoshioka. 1997.