Recording of data with segments of various acoustic environments

Abstract

A technique for improving recognition accuracy when transcribing speech data drawn from a wide range of acoustic environments. In many situations the input contains data from a variety of sources recorded in different environments. The classes of data include clean speech, speech corrupted by noise (e.g., music), non-speech (e.g., pure music with no speech), telephone speech, and data distinguished by the identity of a speaker. The different classes are first identified automatically, and each class is then transcribed by a system built specifically for it. The invention also describes a segmentation algorithm that builds an acoustic model characterizing the data in each class and then uses a dynamic programming algorithm (the Viterbi algorithm) to automatically identify the segments that belong to each class. The acoustic models are built in a particular feature space, and the invention also describes different feature spaces for use with different classes.
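
The segmentation step in the abstract can be illustrated with a minimal sketch. It is not the patent's implementation. It assumes each class is modeled by a single diagonal-covariance Gaussian in one shared feature space (the abstract mentions class-specific feature spaces, which are left out here), and it assumes a fixed log-domain cost for switching class between frames. The names `log_gauss`, `viterbi_segment`, and `switch_penalty` are hypothetical.

```python
import math


def log_gauss(x, mean, var):
    """Log-density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def viterbi_segment(frames, models, switch_penalty=5.0):
    """Label every frame with a class, then merge equal labels into segments.

    frames: list of feature vectors (lists of floats).
    models: dict mapping class name -> (mean, var); assumed one-Gaussian models.
    switch_penalty: assumed log-domain cost of changing class between frames.
    Returns a list of (class_name, start_frame, end_frame_exclusive).
    """
    if not frames:
        return []
    names = list(models)
    # score[c]: best log-score of any label path ending in class c.
    score = {c: log_gauss(frames[0], *models[c]) for c in names}
    backptrs = []
    for x in frames[1:]:
        new_score, ptr = {}, {}
        for c in names:
            # Staying in the same class is free; switching pays the penalty.
            def arrive(p, target=c):
                return score[p] - (0.0 if p == target else switch_penalty)

            best_prev = max(names, key=arrive)
            new_score[c] = arrive(best_prev) + log_gauss(x, *models[c])
            ptr[c] = best_prev
        score = new_score
        backptrs.append(ptr)

    # Trace the best path backwards from the best final class.
    state = max(names, key=lambda c: score[c])
    path = [state]
    for ptr in reversed(backptrs):
        state = ptr[state]
        path.append(state)
    path.reverse()

    # Merge consecutive frames that share a label into segments.
    segments, start = [], 0
    for t in range(1, len(path) + 1):
        if t == len(path) or path[t] != path[start]:
            segments.append((path[start], start, t))
            start = t
    return segments


# Toy one-dimensional example with two made-up classes.
toy_models = {"clean": ([0.0], [1.0]), "music": ([5.0], [1.0])}
toy_frames = [[0.1], [-0.2], [0.0], [5.1], [4.9], [5.2]]
print(viterbi_segment(toy_frames, toy_models))
```

Each returned segment could then be handed to a recognizer built for its class. The switching cost keeps a single atypical frame from splitting a segment; its value here is arbitrary.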

Bibliographic data

Publication number: DE69722980D1
Patent type:
Publication date: 2003-07-31
Original format: PDF
Applicant/assignee: INTERNATIONAL BUSINESS MACHINES CORP. ARMONK
Application number: DE19976022980T
Inventors: BAHL LALIT RAI; GOPALAKRISHNAN PONANI; GOPINATH RAMESH AMBAT; MAES STEPHANE HERMAN; PANMANABHAN MUKUND; POLYMENAKOS LAZAROS
Filing date: 1997-01-17
Classification: G10L15/20; G10L17/00
Country: DE
Database entry time: 2022-08-21 23:39:07