Diarization driven by meta-information identified in discussion content

Abstract

An approach is provided that receives an audio stream and uses a voice activation detection (VAD) process to create a digital audio stream of voices from at least two different speakers. An automatic speech recognition (ASR) process is applied to the digital stream, producing the spoken words. A speaker turn detection (STD) process is then applied to those words to identify a number of speaker segments, each ending at a word boundary. The STD process analyzes the speaker segments using a language model that determines when speaker changes occur. A speaker clustering algorithm is then applied to the speaker segments to associate one of the speakers with each segment.
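The last two stages of the pipeline described in the abstract (speaker turn detection at word boundaries, then speaker clustering) can be illustrated with a minimal Python sketch. It is not the patented method. The abstract does not specify the language model, so `toy_change_score` is a hypothetical heuristic standing in for it, and the `TURN_OPENERS` word list, the thresholds, and the per-segment embedding vectors are likewise assumptions made for illustration. VAD and ASR are assumed to have already produced timestamped words.

```python
from dataclasses import dataclass
from math import sqrt
from typing import Callable, List, Sequence


@dataclass
class Word:
    text: str
    start: float  # seconds
    end: float    # seconds


@dataclass
class Segment:
    words: List[Word]

    @property
    def text(self) -> str:
        return " ".join(w.text for w in self.words)


# Hypothetical list of words that often open a new turn (assumption).
TURN_OPENERS = {"yes", "no", "well", "okay", "sure"}


def toy_change_score(prev_words: Sequence[Word], next_word: Word) -> float:
    """Hypothetical stand-in for the language model: returns a score in
    [0, 1] for a speaker change occurring just before next_word."""
    prev = prev_words[-1]
    score = 0.0
    if prev.text.endswith("?"):                      # a question invites a reply
        score += 0.6
    if next_word.text.lower().strip(",.?!") in TURN_OPENERS:
        score += 0.3
    if next_word.start - prev.end > 0.5:             # long pause between words
        score += 0.2
    return min(score, 1.0)


def detect_turns(
    words: Sequence[Word],
    score: Callable[[Sequence[Word], Word], float] = toy_change_score,
    threshold: float = 0.5,
) -> List[Segment]:
    """Split the ASR word sequence into segments. Cuts are only ever made
    between two words, so every segment ends at a word boundary."""
    segments: List[Segment] = []
    current: List[Word] = []
    for word in words:
        if current and score(current, word) >= threshold:
            segments.append(Segment(current))
            current = []
        current.append(word)
    if current:
        segments.append(Segment(current))
    return segments


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def cluster_segments(
    embeddings: Sequence[Sequence[float]], min_similarity: float = 0.8
) -> List[int]:
    """Greedy clustering: assign each segment embedding to the most similar
    existing cluster, or open a new cluster if none is similar enough.
    Each cluster is represented by its first member (a simplification)."""
    representatives: List[List[float]] = []
    labels: List[int] = []
    for emb in embeddings:
        best, best_sim = -1, min_similarity
        for idx, rep in enumerate(representatives):
            sim = cosine(emb, rep)
            if sim >= best_sim:
                best, best_sim = idx, sim
        if best < 0:
            representatives.append(list(emb))
            best = len(representatives) - 1
        labels.append(best)
    return labels
```

To use it, pass the ASR output as a list of `Word` objects to `detect_turns`, compute one feature vector per resulting segment with whatever speaker-embedding method is available, and pass those vectors to `cluster_segments`; the returned integer labels play the role of speaker identities for the segments.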

Bibliographic data

Publication number: US10468031B2

Publication date: 2019-11-05

Original format: PDF
Applicant/Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Application number: US201715819158
Inventors: KENNETH W. CHURCH; DIMITRIOS B. DIMITRIADIS; PETR FOUSEK; MIROSLAV NOVAK; GEORGE A. SAON

Filing date: 2017-11-21
Classification: G10L17; G10L15/22; G10L15/30; G10L15/183; G10L25/51; G10L25/78
Country: US
Date added to database: 2022-08-21 12:13:55
