Journal: International Journal of Speech Technology

Detection of interactive voice response (IVR) in phone call records



Abstract

Real-time separation of pre-recorded messages (Interactive Voice Response, IVR) from live speech fragments plays a significant role in speech emotion recognition (SER) systems, unwanted call filtering, automatic detection of answering machine responses, reduction of stored record sizes, voice mail spam filtering, etc. The difficulty of the problem is that, unlike the silence, music, and noise fragments studied in conventional voice activity detection (VAD), IVR usually contains speech. Three classifiers for detecting live speech fragments in phone call records are considered, based on the support vector machine (SVM), gradient boosting (XGBoost), and a convolutional neural network (CNN). The Geneva Minimalistic Acoustic Parameter Set was used as the feature representation of audio fragments for XGBoost and SVM, and log-spectrograms and gammatonegrams were used for the CNN. Experiments with a dataset of phone calls demonstrate comparable quality of the considered algorithms (around 0.96 according to the F1-averaged measure), with the CNN having an advantage (0.98).
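As an illustration of the log-spectrogram input mentioned in the abstract, the following is a minimal sketch of how such a representation might be computed from a mono audio fragment. It is not the authors' pipeline: the function name `log_spectrogram`, the 8 kHz sample rate, the 25 ms frame, the 10 ms hop, and the Hann window are all assumptions chosen for the example.

```python
import numpy as np

def log_spectrogram(signal, sample_rate=8000, frame_ms=25, hop_ms=10, eps=1e-10):
    """Return a (frames, frequency bins) log-power spectrogram of a 1-D signal.

    All default parameters are illustrative assumptions, not values from the paper.
    """
    frame_len = int(sample_rate * frame_ms / 1000)  # samples per analysis frame
    hop_len = int(sample_rate * hop_ms / 1000)      # samples between frame starts
    if len(signal) < frame_len:
        raise ValueError("signal is shorter than one frame")

    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop_len

    # Slice the signal into overlapping frames and apply the window to each.
    frames = np.stack([
        signal[i * hop_len : i * hop_len + frame_len] * window
        for i in range(n_frames)
    ])

    # Power spectrum of each frame; eps avoids log(0) on silent frames.
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    return np.log(power + eps)

# Usage: a one-second 1 kHz test tone sampled at 8 kHz.
sr = 8000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 1000 * t)
spec = log_spectrogram(tone, sample_rate=sr)
```

The resulting two-dimensional array (time frames by frequency bins) is the kind of image-like input a CNN can consume, whereas SVM and XGBoost typically operate on a fixed-length feature vector per fragment.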
