Confidence measures as a search guide in speech recognition.

Abstract

Despite significant advances in speech and language technologies, speech recognition systems are still not perfect. Every recognition hypothesis carries some inherent degree of uncertainty. Utterance verification was proposed as a backup technique for verifying the reliability of speech recognition results. It uses quantitative scores, such as confidence measures, to estimate the reliability of a recognition decision. However, it generally provides confidence information only after the recognition phase. The maximum expected benefit is therefore confined to verifying the final decoding output, with no mechanism for detecting and avoiding errors early.

In this thesis, an online confidence estimation and hypothesis verification approach is introduced. By incorporating confidence information early in the search phase, the recognizer can be directed toward the most promising paths, which may lead to a more accurate final decoding result. Three techniques are proposed for this purpose.

The first technique is Confidence Based Pruning (CBP). Here confidence information acts as an online filter applied to word-level partial hypotheses, deciding whether each one is kept for future expansion or discarded from the search space.

The second technique is Confidence Based Language Modeling (CBLM). Here confidence information is used to adjust the language model score. This confidence-based tuning favors the language model score in regions of well-matched acoustics and makes it play second fiddle when the acoustics are ambiguous. The main advantage of this technique is that it minimizes errors in which the language model overwhelms the acoustics. These errors usually take the form of inserted words that have no acoustic evidence.

The third technique is Confidence Based Fast Match (CBFM). Here confidence information is used to look ahead in time, identify search extensions with poor acoustic scores, and discard them before the expensive detailed match evaluation is applied. This technique achieves a considerable improvement in decoding speed with very little sacrifice in accuracy.

Incorporating confidence measures into the decoding process imposes constraints on the type of measure that can be used. The measure has to be computationally inexpensive so that it does not affect the efficiency of the search process. It should also be extracted synchronously from the online information available during the search. This thesis investigates two confidence measures for online confidence estimation during the search: the posterior probability measure and the average base-phone rank measure. Both satisfy the efficiency and synchronization conditions. Moreover, both are derived only from the acoustic model, so they can be used as a tuning parameter for the language model score in the proposed CBLM technique without circular reasoning. Because the two measures are derived from two different views of the acoustic scores, they are integrated into a composite measure using a neural network to build a more robust measure.
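The pruning and language-model tuning ideas described in the abstract can be illustrated with a small sketch. This is not the thesis's implementation: the abstract gives no formulas, so the `PartialHypothesis` type, the fixed `threshold`, and the rule that multiplies the language model weight by the confidence value are all assumptions made here for illustration only.

```python
from dataclasses import dataclass


@dataclass
class PartialHypothesis:
    """Hypothetical word-level partial hypothesis (all names are assumptions)."""
    word: str
    acoustic_logprob: float  # log-likelihood from the acoustic model
    lm_logprob: float        # log-probability from the language model
    confidence: float        # acoustic-only confidence, assumed to lie in [0, 1]


def cbp_keep(hyp: PartialHypothesis, threshold: float = 0.3) -> bool:
    """CBP-style filter: keep a hypothesis only if its confidence reaches
    the threshold. The threshold value is an assumption."""
    return hyp.confidence >= threshold


def cblm_score(hyp: PartialHypothesis, base_lm_weight: float = 10.0) -> float:
    """CBLM-style score: the language model weight is scaled by confidence,
    so the language model counts more where the acoustics match well and
    less where they are ambiguous. The linear scaling is an assumption."""
    effective_weight = base_lm_weight * hyp.confidence
    return hyp.acoustic_logprob + effective_weight * hyp.lm_logprob


def expand(hyps, threshold: float = 0.3, base_lm_weight: float = 10.0):
    """Prune low-confidence hypotheses, then rank the survivors by the
    confidence-tuned combined score (best first)."""
    survivors = [h for h in hyps if cbp_keep(h, threshold)]
    return sorted(survivors,
                  key=lambda h: cblm_score(h, base_lm_weight),
                  reverse=True)


if __name__ == "__main__":
    candidates = [
        PartialHypothesis("the", -100.0, -2.0, 0.9),
        PartialHypothesis("a", -95.0, -1.0, 0.1),    # low confidence: pruned
        PartialHypothesis("tea", -105.0, -4.0, 0.5),
    ]
    for hyp in expand(candidates):
        print(hyp.word, cblm_score(hyp))
```

Because the confidence value in this sketch is assumed to come from the acoustic model alone, using it to scale the language model weight does not feed language model information back into itself, which mirrors the circular-reasoning point made in the abstract.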

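Bibliographic record

  • Author: Abdou, Sherif Mahdy
  • Author affiliation: University of Miami
  • Degree-granting institution: University of Miami
  • Subject: Computer Science
  • Degree: Ph.D.
  • Year: 2003
  • Pages: 135 p.
  • Total pages: 135
  • Original format: PDF
  • Language of text: English (eng)
  • Chinese Library Classification: Automation technology, computer technology
  • Keywords: (none listed)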
