International Conference on Speech and Computer
Diarization of the Language Consulting Center Telephone Calls

Abstract

In this paper, we describe the diarization of archive data from the project "Access to a Linguistically Structured Database of Enquiries from the Language Consulting Center". The project aims to improve access to the large Czech-language archives, consisting mainly of telephone conversations, that the Language Consulting Center collects continuously. One part of the archive contains mono recordings in which the speech of the client and the language counsellor is mixed in a single channel. Our proposed diarization approach uses the identity of the language counsellor, obtained from the text transcription of the beginning of the conversation. For the initial stage of diarization, we adopted our system based on clustering x-vectors. A resegmentation step then refines the speaker-change boundaries using a pre-trained Gaussian mixture model of the counsellor. Because our data are unique, we compared our results with the Kaldi diarization as the baseline system.
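
The resegmentation idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the client is modelled or how frame decisions are smoothed, so the single-Gaussian client model, the majority-vote smoothing, the toy one-dimensional features, and every function name below are assumptions made purely for illustration. The initial labels stand in for the output of the x-vector clustering stage.

```python
import math

def log_gauss(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def log_gmm(x, gmm):
    """Log-likelihood of x under a GMM given as [(weight, mean, var), ...]."""
    terms = [math.log(w) + log_gauss(x, m, v) for w, m, v in gmm]
    top = max(terms)  # log-sum-exp for numerical stability
    return top + math.log(sum(math.exp(t - top) for t in terms))

def fit_gaussian(frames):
    """Fit one diagonal Gaussian to frames (assumed stand-in client model)."""
    n, dim = len(frames), len(frames[0])
    mean = [sum(f[d] for f in frames) / n for d in range(dim)]
    var = [max(sum((f[d] - mean[d]) ** 2 for f in frames) / n, 1e-3)
           for d in range(dim)]
    return [(1.0, mean, var)]

def resegment(frames, initial_labels, counsellor_gmm, window=1):
    """Relabel frames by comparing the counsellor GMM with a client model.

    The client model is estimated from frames the initial stage labelled
    "client"; per-frame decisions are smoothed by a majority vote over
    `window` frames on each side (an assumption, not taken from the paper).
    """
    client_frames = [f for f, lab in zip(frames, initial_labels)
                     if lab == "client"]
    client_gmm = fit_gaussian(client_frames)
    is_counsellor = [log_gmm(f, counsellor_gmm) > log_gmm(f, client_gmm)
                     for f in frames]
    labels = []
    for i in range(len(frames)):
        lo, hi = max(0, i - window), min(len(frames), i + window + 1)
        votes = sum(is_counsellor[lo:hi])
        labels.append("counsellor" if 2 * votes > hi - lo else "client")
    return labels

# Toy 1-D features: counsellor frames near 0, client frames near 5.
frames = [[0.1], [-0.2], [0.0], [0.3], [5.1], [4.8], [5.2], [4.9]]
# Initial labels with the speaker change placed one frame too early.
initial = ["counsellor"] * 3 + ["client"] * 5
# Hypothetical pre-trained counsellor model with two components.
counsellor_gmm = [(0.5, [-0.2], [0.25]), (0.5, [0.2], [0.25])]

refined = resegment(frames, initial, counsellor_gmm)
print(refined)
```

In this toy input the fourth frame, initially assigned to the client, is moved back to the counsellor because it scores higher under the counsellor model. A real system would operate on acoustic features and would typically use a sequence decoder rather than a simple vote.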