《计算机应用与软件》 (Computer Applications and Software)

Research on an Audio-Visual Information Fusion Method Based on Image Sonification (基于图像可听化的视听信息融合方法研究)

         

Abstract

In traditional audio-visual bimodal speech recognition systems, the visual features obtained through image processing suffer from a large data volume and the loss of important features. To address these problems, image sonification is applied to extract features from the video images. A BP neural network optimized by a genetic algorithm is used as the fusion model, and the audio and visual features are fused at the feature level. Experimental results show that, after image sonification, the visual features contain a certain amount of speech information and their recognition performance remains stable in noisy environments; the neural-network fusion model improves the robustness of the system.
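The abstract does not give implementation details, so the following is only a minimal illustrative sketch of the general idea it describes: acoustic features and sonification-derived visual features are concatenated (feature-level fusion), a genetic algorithm searches for a good initial weight vector for a small BP network, and ordinary back-propagation then refines it. All variable names, feature dimensions, and GA/BP hyper-parameters below are assumptions for illustration, not the paper's actual configuration, and random data stands in for the real audio-visual features.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Hypothetical stand-in data (names and dimensions are assumptions) ---
# audio_feat: acoustic features (e.g. MFCC-like); visual_feat: features that
# would come from image sonification of the lip-region video. Random here.
n_samples, n_audio, n_visual, n_classes = 200, 12, 8, 4
audio_feat = rng.normal(size=(n_samples, n_audio))
visual_feat = rng.normal(size=(n_samples, n_visual))
labels = rng.integers(0, n_classes, size=n_samples)

# Feature-level fusion: concatenate the two feature streams per sample.
X = np.hstack([audio_feat, visual_feat])
Y = np.eye(n_classes)[labels]                       # one-hot targets

n_in, n_hid, n_out = X.shape[1], 16, n_classes
n_weights = n_in * n_hid + n_hid + n_hid * n_out + n_out

def unpack(w):
    """Split a flat chromosome into the BP network's weights and biases."""
    i = 0
    W1 = w[i:i + n_in * n_hid].reshape(n_in, n_hid); i += n_in * n_hid
    b1 = w[i:i + n_hid]; i += n_hid
    W2 = w[i:i + n_hid * n_out].reshape(n_hid, n_out); i += n_hid * n_out
    return W1, b1, W2, w[i:]

def forward(w, X):
    W1, b1, W2, b2 = unpack(w)
    h = np.tanh(X @ W1 + b1)                        # hidden layer
    z = h @ W2 + b2
    p = np.exp(z - z.max(axis=1, keepdims=True))
    return h, p / p.sum(axis=1, keepdims=True)      # softmax outputs

def fitness(w):
    """GA fitness: mean log-likelihood of the training labels (higher is better)."""
    _, p = forward(w, X)
    return np.log(p[np.arange(n_samples), labels] + 1e-12).mean()

# --- Genetic algorithm searches for a good initial weight vector ---
pop = rng.normal(scale=0.5, size=(30, n_weights))
for _ in range(40):
    scores = np.array([fitness(ind) for ind in pop])
    parents = pop[np.argsort(scores)[::-1][:10]]    # truncation selection
    children = []
    for _ in range(len(pop) - len(parents)):
        a, b = parents[rng.integers(len(parents), size=2)]
        mask = rng.random(n_weights) < 0.5          # uniform crossover
        child = np.where(mask, a, b)
        child += rng.normal(scale=0.05, size=n_weights)   # mutation
        children.append(child)
    pop = np.vstack([parents, children])

w = pop[np.argmax([fitness(ind) for ind in pop])]   # best chromosome

# --- Refine the GA initialisation with ordinary back-propagation ---
lr = 0.1
for _ in range(200):
    W1, b1, W2, b2 = unpack(w)
    h, p = forward(w, X)
    d2 = (p - Y) / n_samples                        # softmax + cross-entropy gradient
    gW2, gb2 = h.T @ d2, d2.sum(0)
    dh = (d2 @ W2.T) * (1.0 - h ** 2)               # tanh derivative
    gW1, gb1 = X.T @ dh, dh.sum(0)
    w = np.concatenate([(W1 - lr * gW1).ravel(), b1 - lr * gb1,
                        (W2 - lr * gW2).ravel(), b2 - lr * gb2])

print("training accuracy:", (forward(w, X)[1].argmax(1) == labels).mean())
```

The usual rationale for this combination is that the genetic algorithm performs a global search over the weight space and so reduces the chance of BP starting near a poor local minimum, while gradient-based BP then fine-tunes the weights efficiently from that starting point.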

