Recording of data with segments of various acoustic environments

Abstract

A technique for improving recognition accuracy when transcribing speech data that spans a wide range of acoustic environments. In many situations the input contains data from a variety of sources in different environments. The classes of data include clean speech, speech corrupted by noise (e.g., music), non-speech (e.g., pure music with no speech), telephone speech, and the identity of a speaker. In the technique described, the different classes of data are first identified automatically, and each class is then transcribed by a system built specifically for it. The invention also describes a segmentation algorithm that builds an acoustic model characterizing the data in each class and then uses a dynamic programming algorithm (the Viterbi algorithm) to automatically identify the segments that belong to each class. The acoustic models are built in a particular feature space, and the invention also describes different feature spaces for use with different classes.
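
The segmentation step in the abstract can be illustrated with a minimal sketch. It is not the patent's implementation: it assumes a hypothetical one-dimensional feature per frame, a single Gaussian per class as a toy stand-in for the acoustic models, and a fixed `switch_penalty` for changing class between frames. The function name `viterbi_segment` and all parameters are illustrative assumptions.

```python
import math

def gaussian_loglik(x, mean, var):
    """Log-likelihood of a scalar feature x under a 1-D Gaussian."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

def viterbi_segment(frames, models, switch_penalty=5.0):
    """Label each frame with a class via Viterbi, then merge runs into segments.

    frames: list of scalar features (assumed 1-D feature space, for illustration).
    models: dict mapping class name -> (mean, var); a toy acoustic model per class.
    switch_penalty: assumed cost, in log units, of changing class between frames.
    Returns a list of (class_name, start_frame, end_frame) with end exclusive.
    """
    classes = list(models)
    if not frames:
        return []
    # score[c]: best log-score of any label path ending in class c at this frame
    score = {c: gaussian_loglik(frames[0], *models[c]) for c in classes}
    backptrs = []
    for x in frames[1:]:
        new_score, ptr = {}, {}
        for c in classes:
            # Staying in the same class is free; switching pays the penalty.
            prev = max(
                classes,
                key=lambda p: score[p] - (0.0 if p == c else switch_penalty),
            )
            trans = 0.0 if prev == c else switch_penalty
            new_score[c] = score[prev] - trans + gaussian_loglik(x, *models[c])
            ptr[c] = prev
        score = new_score
        backptrs.append(ptr)
    # Trace back from the best final class to recover the label path.
    state = max(classes, key=lambda c: score[c])
    path = [state]
    for ptr in reversed(backptrs):
        state = ptr[state]
        path.append(state)
    path.reverse()
    # Merge consecutive identical labels into segments.
    segments, start = [], 0
    for i in range(1, len(path) + 1):
        if i == len(path) or path[i] != path[start]:
            segments.append((path[start], start, i))
            start = i
    return segments

if __name__ == "__main__":
    toy_models = {"speech": (0.0, 1.0), "music": (5.0, 1.0)}  # made-up values
    print(viterbi_segment([0.1, -0.2, 0.3, 5.1, 4.8, 5.2, 4.9], toy_models))
```

Each resulting segment could then be handed to a transcription system built for its class, as the abstract describes. The switch penalty is one simple way to discourage spurious single-frame class changes; a real system would use richer models and feature spaces chosen per class.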
