Conference: Signal Processing, Sensor/Information Fusion, and Target Recognition XXVI

Speaker Identification for the Improvement of the Security Communication between Law Enforcement Units



Abstract

This article discusses speaker identification as a means of improving the security of communication between law enforcement units. The main task of this research was to develop a text-independent speaker identification system that can be used for real-time recognition. The system is designed for open-set identification, which means that the unknown speaker can be anyone. The communication itself is secured, but the authorization of the communicating parties still has to be checked: we have to decide whether the unknown speaker is authorized for the given action. The calls are recorded by an IP telephony server, and the recordings are then evaluated by a classifier. If the system concludes that the speaker is not authorized, it sends a warning message to the administrator. Such a message can indicate, for example, a stolen phone or another unusual situation. The administrator then takes the appropriate action. The proposed system uses a multilayer neural network for classification, consisting of three layers (input layer, hidden layer, and output layer). The number of neurons in the input layer corresponds to the length of the speech feature vector. The output layer represents the classified speakers. The artificial neural network classifies the speech signal frame by frame, but the final decision is made over the complete recording. This rule substantially increases the accuracy of the classification. The input data for the neural network are thirteen Mel-frequency cepstral coefficients, which describe the behavior of the vocal tract. These are the most commonly used features for speaker recognition. The parameters for training, testing, and validation were extracted from recordings of authorized users. The recording conditions for the training data correspond to the real traffic of the system (sampling frequency, bit rate). The main benefit of the research is the developed text-independent speaker identification system, which is applied to secure communication between law enforcement units.
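
The frame-by-frame classification with a decision over the whole recording can be illustrated with a minimal sketch. The abstract does not state how the per-frame outputs are combined or how an unauthorized speaker is rejected, so everything specific here is an assumption made for illustration: the function names `frame_posteriors` and `identify_recording`, the tanh hidden layer with softmax output, averaging of frame posteriors, and the rejection `threshold`. Only the three-layer structure and the 13-coefficient input come from the abstract.

```python
import numpy as np


def softmax(z):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def frame_posteriors(frames, w_hidden, b_hidden, w_out, b_out):
    """Forward pass of a three-layer network (input, hidden, output).

    frames: array of shape (n_frames, 13), one MFCC vector per frame.
    Returns an array of shape (n_frames, n_speakers) whose rows sum to 1.
    The tanh/softmax activations are assumptions, not taken from the paper.
    """
    hidden = np.tanh(frames @ w_hidden + b_hidden)
    return softmax(hidden @ w_out + b_out)


def identify_recording(posteriors, threshold=0.6):
    """Combine per-frame outputs into one decision for the whole recording.

    Averages the frame posteriors (an assumed combination rule) and returns
    the index of the best-scoring speaker, or None when the averaged score
    falls below the hypothetical threshold, which stands in for the
    open-set "not authorized" outcome.
    """
    mean_post = posteriors.mean(axis=0)
    best = int(mean_post.argmax())
    if mean_post[best] < threshold:
        return None
    return best


# Three frames, three enrolled speakers: the second frame alone points to
# speaker 1, but the recording-level average still favors speaker 0.
example = np.array([
    [0.90, 0.05, 0.05],
    [0.20, 0.70, 0.10],
    [0.80, 0.10, 0.10],
])
decision = identify_recording(example, threshold=0.6)
```

In this sketch a single misclassified frame does not change the outcome, which is one plausible reading of why deciding over the complete recording improves accuracy. A return value of `None` would correspond to the case in which the administrator is warned.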
