
WWN-8: Incremental Online Stereo with Shape-from-X Using Life-Long Big Data from Multiple Modalities


Abstract

When a child lives in the real world, from infancy to adulthood, his retinae receive a flood of stereo sensory streams. His muscles produce another stream of actions. How does the child's brain deal with such big data from multiple sensory modalities (left- and right-eye modalities) and multiple effector modalities (location, disparity map, and shape type)? This capability, which incrementally learns to produce simple-to-complex sensorimotor behaviors, is autonomous development. We present a model that incrementally fuses such an open-ended, life-long stream and updates the “brain” online so that the perceived world is 3D. Traditional methods for shape-from-X use a particular type of cue X (e.g., stereo disparity, shading, etc.) to compute depths or local shapes based on a handcrafted physical model. Such a model is likely to produce a brittle system because the availability of the cue fluctuates. An embodiment of the Developmental Network (DN), called the Stereo Where-What Network (WWN-8), learns to perform simultaneous attention and recognition while developing invariances in location, disparity, shape, and surface type, so that multiple cues can automatically fill in when a particular type of cue (e.g., texture) is locally missing from the real world. We report three experiments: 1) dynamic synapse retraction and growth as a method of developing receptive fields; 2) training for recognizing 3D objects directly in cluttered natural backgrounds; 3) integration of depth perception with location and type information. The experiments used stereo images and motor actions on the order of 10^5 frames. Potential applications include driver assistance for road safety, mobile robots, autonomous navigation, and autonomous vision-guided manipulators.
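
The DN mechanisms named in the abstract (winner-take-all firing, incremental online learning, and synapse retraction for receptive-field development) can be pictured with a minimal sketch. The Python snippet below is not the authors' implementation: it assumes a DN-style layer with top-1 competition, incremental Hebbian-style averaging of the winner's weights, and a hypothetical deviation-based retraction rule; all class names, parameters, and thresholds are illustrative assumptions.

```python
import numpy as np

class DNLayerSketch:
    """Illustrative DN-style layer: top-1 competition, incremental Hebbian
    averaging, and dynamic synapse retraction (all details are assumptions)."""

    def __init__(self, n_neurons, n_inputs, retract_threshold=0.5, seed=0):
        rng = np.random.default_rng(seed)
        self.w = rng.random((n_neurons, n_inputs))    # bottom-up weights
        self.mask = np.ones((n_neurons, n_inputs))    # 1 = synapse kept, 0 = retracted
        self.age = np.zeros(n_neurons)                # per-neuron firing age
        self.dev = np.zeros((n_neurons, n_inputs))    # running input-weight deviation
        self.retract_threshold = retract_threshold

    def step(self, x):
        # Pre-response: normalized inner product restricted to live synapses.
        w_eff = self.w * self.mask
        norms = np.linalg.norm(w_eff, axis=1) * np.linalg.norm(x) + 1e-12
        response = (w_eff @ x) / norms

        # Top-1 competition: only the winning neuron fires and adapts.
        j = int(np.argmax(response))
        self.age[j] += 1.0
        lr = 1.0 / self.age[j]                        # plain running average for brevity

        # Hebbian-style incremental update of the winner's weights.
        self.w[j] = (1.0 - lr) * self.w[j] + lr * x

        # Track per-synapse mismatch; retract synapses that match poorly over time,
        # shrinking the receptive field (growth would re-enable synapses similarly).
        self.dev[j] = (1.0 - lr) * self.dev[j] + lr * np.abs(x - self.w[j])
        self.mask[j, self.dev[j] > self.retract_threshold] = 0.0
        return j, float(response[j])

# Hypothetical usage: feed a stream of left/right image patches flattened to vectors.
layer = DNLayerSketch(n_neurons=20, n_inputs=64)
for frame in np.random.default_rng(1).random((100, 64)):
    winner, strength = layer.step(frame)
```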
