Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing

Tracking of Multiple Fundamental Frequencies in Diplophonic Voices

Abstract

Diplophonia is a type of pathological voice in which two fundamental frequencies (fo) are present simultaneously. Specialized audio analyzers that can handle up to two fos in diplophonic voices are in their infancy. We propose the tracking of up to two fos in diplophonic voices by audio waveform modeling (AWM), which involves obtaining candidates by repetitive execution of the Viterbi algorithm, followed by waveform Fourier synthesis and heuristic candidate selection with majority voting. Our approach is evaluated with reference fo-tracks obtained from laryngeal high-speed videos of 29 sustained phonations and compared to state-of-the-art tracking algorithms for multiple fos. Two variants of our algorithm are tested, an accurate one and a fast one. The median error rate of the accurate variant is 6.52%, whereas the most accurate benchmark achieves 11.11%. The fast variant is more than twice as fast as the fastest relevant benchmark, and its median error rate is 9.52%. Furthermore, illustrative results of connected speech analysis are reported. Our approach may help to improve detection and analysis of diplophonia in clinical research and practice, as well as to advance synthesis of disordered voices.
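
The abstract names the Viterbi algorithm as the candidate-generation step but gives no detail. As a rough illustration only, the sketch below shows a generic Viterbi search that picks one fo candidate per frame. It is not the authors' AWM pipeline. The function name `viterbi_track`, the per-frame candidate lists, the local costs, and the octave-based jump penalty with its `jump_weight` parameter are all assumptions made for this example.

```python
import math


def viterbi_track(candidates, local_costs, jump_weight=1.0):
    """Pick one fo candidate per frame (hypothetical helper).

    candidates[t][j]  : j-th fo candidate in Hz at frame t
    local_costs[t][j] : assumed per-candidate cost (lower is better)
    jump_weight       : assumed weight on frame-to-frame jumps in octaves
    """
    # best[t][j]: minimal accumulated cost of any path ending in candidate j at frame t
    best = [list(local_costs[0])]
    back = []  # back[t-1][j]: index of the best predecessor of candidate j at frame t
    for t in range(1, len(candidates)):
        row, ptr = [], []
        for j, f in enumerate(candidates[t]):
            # Transition cost: absolute frequency jump measured in octaves.
            options = [
                best[t - 1][i] + jump_weight * abs(math.log2(f / g))
                for i, g in enumerate(candidates[t - 1])
            ]
            i_min = min(range(len(options)), key=options.__getitem__)
            row.append(options[i_min] + local_costs[t][j])
            ptr.append(i_min)
        best.append(row)
        back.append(ptr)

    # Backtrack from the cheapest final state.
    j = min(range(len(best[-1])), key=best[-1].__getitem__)
    path = [j]
    for ptr in reversed(back):
        j = ptr[j]
        path.append(j)
    path.reverse()
    return [candidates[t][j] for t, j in enumerate(path)]


if __name__ == "__main__":
    # Toy data: two candidates per frame, three frames.
    cands = [[100.0, 150.0], [101.0, 152.0], [99.0, 149.0]]
    costs = [[0.1, 0.2], [0.5, 0.1], [0.1, 0.3]]
    print(viterbi_track(cands, costs))
```

In this toy setting the jump penalty keeps the track on one voice source instead of hopping between the two candidates. How the paper obtains a second track by repeated execution is not described in the abstract, so it is not modeled here.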
