METHOD FOR HYBRID GENERATIVE-DISCRIMINATIVE SEGMENTATION OF SPEAKERS IN AUDIO-FLOW

Abstract

FIELD: information technology.

SUBSTANCE: speech segments are extracted from the audio stream, and MFCC acoustic feature vectors are computed. Each speech segment is projected into an eigenvoice (EV) space of dimension 10, yielding a set of vectors Y. Clustering centres C₁ and C₂ of the Y vectors are determined. Discriminative clustering is performed by computing the parameters of the planes H₁ and H₂ and approximately locating the regions where Y vectors that are homogeneous in speaker information concentrate. The resulting segment data are used to initialise variational Bayesian (VB) diarisation. This yields speaker labels for the segments over the whole utterance, on the basis of which the clustering centres C₁ and C₂ are corrected. Discriminative clustering, variational Bayesian analysis and correction of the clustering centres are performed in sequence over several EV-VB iteration stages. At each stage the complete speaker segmentation is analysed, and iteration stops when the segmentation no longer changes. The final segmentation, a table mapping the speech segments of the input signal to speaker indices, is then obtained by Viterbi resegmentation.

EFFECT: improved accuracy of speaker detection for dialogue over a telephone channel.

4 dwg, 1 tbl
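The iterate-until-stable structure described in the abstract can be illustrated with a minimal sketch. It is not the patented method: the function name `segment_two_speakers`, the farthest-point seeding of the two centres, and the use of a single bisecting plane in place of the planes H₁ and H₂ are all assumptions made for illustration, and the variational Bayesian stage and Viterbi resegmentation are omitted entirely. The input `Y` is assumed to be an array of per-segment vectors, for example 10-dimensional projections.

```python
import numpy as np

def segment_two_speakers(Y, max_iter=20):
    """Assign each row of Y to one of two speakers (0 or 1).

    Simplified stand-in for the EV-VB loop: split by a plane, re-estimate
    the two centres, and stop when the labels no longer change.
    """
    Y = np.asarray(Y, dtype=float)
    # Assumed initialisation (not from the abstract): farthest-point seeding.
    c1 = Y[np.argmax(np.linalg.norm(Y - Y.mean(axis=0), axis=1))]
    c2 = Y[np.argmax(np.linalg.norm(Y - c1, axis=1))]
    centres = np.stack([c1, c2])
    labels = np.full(len(Y), -1)
    for _ in range(max_iter):
        # Assumed separating plane: perpendicular bisector of the two centres.
        w = centres[1] - centres[0]
        b = -0.5 * w @ (centres[0] + centres[1])
        new_labels = (Y @ w + b > 0).astype(int)
        # Stopping rule from the abstract: no change in the segmentation.
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
        # Correct each centre from the segments currently assigned to it.
        for k in (0, 1):
            if np.any(labels == k):
                centres[k] = Y[labels == k].mean(axis=0)
    return labels, centres
```

In a fuller pipeline the labels produced inside the loop would seed the VB diarisation stage, and its output, rather than the plane split alone, would drive the correction of the centres.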

Bibliographic data

Publication number: RU2530314C1

Publication date: 2014-10-10

Original format: PDF
Applicant/Patentee: OBSHCHESTVO S OGRANICHENNOJ OTVETSTVENNOSTJU TSRT-INNOVATSII

Application number: RU20130118633
Inventors: PEKHOVSKIJ TIMUR SAKHIEVICH; SHULIPA ANDREJ KONSTANTINOVICH; KHITROV MIKHAIL VASILEVICH

Filing date: 2013-04-23
Classification: G10L15/00
Country: RU
Database entry time: 2022-08-21 15:38:19