Speech-to-text processing based on a time-ordered classification of audio file segments

Abstract

This specification describes technologies relating to multi-core processing for parallel speech-to-text processing. In some implementations, a computer-implemented method is provided that includes the actions of receiving an audio file; analyzing the audio file to identify portions of the audio file as corresponding to one or more audio types; generating a time-ordered classification of the identified portions, the time-ordered classification indicating the one or more audio types and position within the audio file of each portion; generating a queue using the time-ordered classification, the queue including a plurality of jobs where each job includes one or more identifiers of a portion of the audio file classified as belonging to the one or more speech types; distributing the jobs in the queue to a plurality of processors; performing speech-to-text processing on each portion to generate a corresponding text file; and merging the corresponding text files to generate a transcription file.
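The steps in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `Segment` type, the audio-type labels, `build_queue`, `fake_transcribe`, and `transcribe_file` are all hypothetical names, the classification is taken as already computed, and a thread pool stands in for the "plurality of processors" (a process pool would be the closer match for real multi-core work).

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Segment:
    """One classified portion of the audio file (hypothetical structure)."""
    start: float      # position in the file, in seconds
    end: float
    audio_type: str   # assumed labels: "speech", "music", "silence"


def build_queue(classification: List[Segment],
                segments_per_job: int = 2) -> List[List[Segment]]:
    """Keep only speech segments, in time order, and group them into jobs."""
    speech = sorted((s for s in classification if s.audio_type == "speech"),
                    key=lambda s: s.start)
    return [speech[i:i + segments_per_job]
            for i in range(0, len(speech), segments_per_job)]


def fake_transcribe(segment: Segment) -> str:
    """Placeholder for a real speech-to-text engine (assumption)."""
    return f"[text {segment.start:.1f}-{segment.end:.1f}]"


def run_job(job: List[Segment],
            transcribe: Callable[[Segment], str]) -> List[Tuple[float, str]]:
    """Transcribe every segment in one job, tagging each result with its start time."""
    return [(seg.start, transcribe(seg)) for seg in job]


def transcribe_file(classification: List[Segment],
                    transcribe: Callable[[Segment], str] = fake_transcribe,
                    workers: int = 4) -> str:
    """Distribute jobs to workers, then merge the pieces back in time order."""
    jobs = build_queue(classification)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(lambda job: run_job(job, transcribe), jobs)
        pieces = [piece for job_result in results for piece in job_result]
    pieces.sort(key=lambda piece: piece[0])   # merge step: restore time order
    return " ".join(text for _, text in pieces)


classification = [
    Segment(0.0, 2.0, "silence"),
    Segment(5.0, 9.0, "speech"),
    Segment(2.0, 5.0, "speech"),
    Segment(9.0, 12.0, "music"),
    Segment(12.0, 15.0, "speech"),
]
transcript = transcribe_file(classification)
```

Tagging each result with its start time lets the merge step restore the original order regardless of which worker finishes first, which is the role the time-ordered classification plays in the abstract.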

Bibliographic details

Publication number: US8423361B1

Publication date: 2013-04-16

Original format: PDF
Applicant/patentee: WALTER W. CHANG; MICHAEL J. WELCH

Application number: US201213420398
Inventors: WALTER W. CHANG; MICHAEL J. WELCH

Filing date: 2012-03-14
Classification: G10L15/26
Country: US
Database entry time: 2022-08-21 16:45:38
