Conference: IEEE International Conference on Acoustics, Speech and Signal Processing

'Hello? Who Am I Talking to?' A Shallow CNN Approach for Human vs. Bot Speech Classification

Abstract

Automatic speech generation algorithms, enhanced by deep learning techniques, enable increasingly seamless and immediate machine-to-human interaction. As a result, the latest generation of phone-calling bots sounds more convincingly human than previous generations. The application of this technology has a strong social impact in terms of privacy issues (e.g., in customer-care services), fraudulent actions (e.g., social hacking), and erosion of trust (e.g., generation of fake conversations). For these reasons, it is crucial to identify whether a speaker is a human or a bot. In this paper, we propose a speech classification algorithm based on Convolutional Neural Networks (CNNs) that automatically classifies speakers as human or non-human from short audio excerpts. We evaluate the effectiveness of the proposed solution using a database of real human speech, populated with audio recordings from various sources, together with speech generated automatically by state-of-the-art text-to-speech generators based on deep learning (e.g., Google WaveNet).
