Journal: IEEE Transactions on Audio, Speech and Language Processing

Computationally Efficient and Robust BIC-Based Speaker Segmentation


Abstract

An algorithm for automatic speaker segmentation based on the Bayesian information criterion (BIC) is presented. BIC tests are not performed at every window shift, as in previous approaches, but only when a speaker change is most likely to occur. This is done by estimating the next probable change point from a model of utterance durations. The inverse Gaussian distribution is found to fit the distribution of utterance durations best. As a result, fewer BIC tests are needed, making the proposed system less computationally demanding in time and memory, and considerably more efficient with respect to missed speaker change points. A feature selection algorithm based on a branch-and-bound search strategy is applied to identify the most efficient features for speaker segmentation. Furthermore, a new theoretical formulation of BIC is derived by applying centering and simultaneous diagonalization. This formulation is considerably more computationally efficient than the standard BIC when the covariance matrices are estimated by estimators other than the usual maximum-likelihood ones. Two commonly used pairs of figures of merit are employed and their relationship is established. Computational efficiency is achieved through the speaker utterance modeling, whereas robustness is achieved by feature selection and by applying BIC tests at appropriately selected time instants. Experimental results indicate that the proposed modifications yield superior performance compared to existing approaches.
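To make the two building blocks named in the abstract concrete, here is a minimal sketch. It shows the standard textbook BIC test between two adjacent windows of feature vectors, and the closed-form maximum-likelihood fit of an inverse Gaussian to utterance durations. It is not the paper's centered and simultaneously diagonalized formulation, and it does not reproduce the paper's scheduling of BIC tests. The function names, the `penalty_weight` parameter, and the choice of full-covariance Gaussians are assumptions made for illustration.

```python
import numpy as np


def logdet_cov(frames):
    """Log-determinant of the maximum-likelihood covariance of the frames."""
    # bias=True gives the ML estimate (division by N rather than N - 1).
    cov = np.atleast_2d(np.cov(frames, rowvar=False, bias=True))
    _, logdet = np.linalg.slogdet(cov)
    return logdet


def delta_bic(x, y, penalty_weight=1.0):
    """Standard delta-BIC for a candidate change point between x and y.

    x, y: arrays of shape (num_frames, num_features).
    A positive value favours modelling x and y with two separate
    Gaussians, i.e. it suggests a speaker change (illustrative sketch).
    """
    z = np.vstack([x, y])
    n_x, n_y, n_z = len(x), len(y), len(z)
    d = z.shape[1]
    # Log-likelihood gain of two Gaussians over a single one.
    gain = 0.5 * (n_z * logdet_cov(z)
                  - n_x * logdet_cov(x)
                  - n_y * logdet_cov(y))
    # Extra free parameters of the second Gaussian: mean plus covariance.
    n_params = d + d * (d + 1) / 2
    penalty = 0.5 * penalty_weight * n_params * np.log(n_z)
    return gain - penalty


def fit_inverse_gaussian(durations):
    """Closed-form ML estimates (mu, lam) of an inverse Gaussian."""
    durations = np.asarray(durations, dtype=float)
    mu = durations.mean()
    lam = len(durations) / np.sum(1.0 / durations - 1.0 / mu)
    return mu, lam
```

In a system of the kind the abstract describes, the fitted duration model would be used to decide where the next BIC test is worth running, instead of calling `delta_bic` at every window shift. How that decision is made is specific to the paper and is not sketched here.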
