IEEE/RSJ International Conference on Intelligent Robots and Systems
Analyzing Liquid Pouring Sequences via Audio-Visual Neural Networks

Abstract

Existing work on estimating the weight of a liquid poured into a target container often requires predefined source weights or visual data. We present novel audio-based and audio-augmented techniques, in the form of multimodal convolutional neural networks (CNNs), to estimate poured weight, detect overflow, and classify the liquid and the target container. Our audio-based neural network uses the sound from a pouring sequence, in which a liquid is poured into a target container. Audio inputs are produced by converting raw audio into mel-scaled spectrograms. Our audio-augmented network fuses this audio with the corresponding visual data taken from video images. Only a microphone and a camera are required, both of which can be found in any modern smartphone or in a Microsoft Kinect. Our approach improves classification accuracy across different environments, containers, and contents in the robot pouring task. Our Pouring Sequence Neural Networks (PSNN) are trained and tested using the Rethink Robotics Baxter Research Robot. To the best of our knowledge, this is the first use of audio-visual neural networks to analyze liquid pouring sequences by classifying their weight, liquid, and receiving container.
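
The audio preprocessing step mentioned in the abstract, converting raw audio into a mel-scaled spectrogram, can be sketched roughly as follows. This is a minimal NumPy illustration and not the authors' pipeline: the sample rate, FFT size, hop length, number of mel bands, and the function names `mel_filterbank` and `mel_spectrogram` are all assumptions chosen for the example, and the mel formula used is one common convention among several.

```python
import numpy as np

def hz_to_mel(f):
    # One common mel-scale convention (assumed here, not taken from the paper).
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels):
    """Triangular filters whose centers are evenly spaced on the mel scale."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    hz_points = mel_to_hz(mel_points)
    fft_freqs = np.fft.rfftfreq(n_fft, d=1.0 / sr)  # (n_fft // 2 + 1,)
    fb = np.zeros((n_mels, fft_freqs.size))
    for i in range(n_mels):
        lo, mid, hi = hz_points[i], hz_points[i + 1], hz_points[i + 2]
        rising = (fft_freqs - lo) / (mid - lo)
        falling = (hi - fft_freqs) / (hi - mid)
        fb[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return fb

def mel_spectrogram(audio, sr=16000, n_fft=512, hop=256, n_mels=40):
    """Return a log-power mel spectrogram of shape (n_mels, n_frames).

    All default parameter values are illustrative assumptions.
    """
    audio = np.asarray(audio, dtype=float)
    if audio.size < n_fft:
        # Zero-pad very short clips so at least one frame exists.
        audio = np.pad(audio, (0, n_fft - audio.size))
    n_frames = 1 + (audio.size - n_fft) // hop
    # Build an index matrix so each row selects one overlapping frame.
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = audio[idx] * np.hanning(n_fft)
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2          # (n_frames, n_fft // 2 + 1)
    mel_power = power @ mel_filterbank(sr, n_fft, n_mels).T   # (n_frames, n_mels)
    # Convert to decibels; the small constant avoids log(0).
    return 10.0 * np.log10(mel_power + 1e-10).T

if __name__ == "__main__":
    # Synthetic stand-in for a recording: one second of a 1 kHz tone.
    t = np.arange(16000) / 16000.0
    spec = mel_spectrogram(np.sin(2 * np.pi * 1000.0 * t))
    print(spec.shape)
```

The resulting two-dimensional array (mel bands by time frames) can be treated like a single-channel image, which is what makes it a natural input for a CNN.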