
PNAS Plus: Brain-inspired automated visual object discovery and detection



Abstract

Despite significant recent progress, machine vision systems lag considerably behind their biological counterparts in performance, scalability, and robustness. A distinctive hallmark of the brain is its ability to automatically discover and model objects, at multiscale resolutions, from repeated exposures to unlabeled contextual data and then to be able to robustly detect the learned objects under various nonideal circumstances, such as partial occlusion and different view angles. Replication of such capabilities in a machine would require three key ingredients: (i) access to large-scale perceptual data of the kind that humans experience, (ii) flexible representations of objects, and (iii) an efficient unsupervised learning algorithm. The Internet fortunately provides unprecedented access to vast amounts of visual data. This paper leverages the availability of such data to develop a scalable framework for unsupervised learning of object prototypes—brain-inspired flexible, scale, and shift invariant representations of deformable objects (e.g., humans, motorcycles, cars, airplanes) comprised of parts, their different configurations and views, and their spatial relationships. Computationally, the object prototypes are represented as geometric associative networks using probabilistic constructs such as Markov random fields. We apply our framework to various datasets and show that our approach is computationally scalable and can construct accurate and operational part-aware object models much more efficiently than in much of the recent computer vision literature. We also present efficient algorithms for detection and localization in new scenes of objects and their partial views.
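The abstract describes object prototypes as geometric associative networks built from probabilistic constructs such as Markov random fields: parts carry appearance scores, and edges between parts encode preferred spatial relationships. The toy sketch below illustrates that general idea only; the part names, candidate locations, costs, and the brute-force inference are all hypothetical and are not the authors' implementation.

```python
import itertools

# Hypothetical toy MRF over object parts (illustrative only, not the
# paper's model). Each part has candidate image locations with a unary
# appearance cost; edges penalize deviation from a preferred offset.

parts = ["wheel_front", "wheel_back", "frame"]

# Candidate (x, y) locations with unary appearance cost (lower = better match).
candidates = {
    "wheel_front": [((10, 20), 0.2), ((40, 22), 0.9)],
    "wheel_back":  [((30, 21), 0.3), ((12, 50), 0.8)],
    "frame":       [((20, 10), 0.4), ((50, 40), 0.7)],
}

# Pairwise edges: (part_a, part_b) -> preferred offset (dx, dy) of b from a.
edges = {
    ("wheel_front", "wheel_back"): (20, 0),
    ("wheel_front", "frame"): (10, -10),
}

def pairwise_cost(loc_a, loc_b, preferred):
    """Quadratic deformation penalty for deviating from the preferred offset."""
    dx = (loc_b[0] - loc_a[0]) - preferred[0]
    dy = (loc_b[1] - loc_a[1]) - preferred[1]
    return 0.01 * (dx * dx + dy * dy)

def total_energy(assignment):
    """Sum of unary appearance costs plus pairwise spatial costs."""
    e = sum(cost for _, cost in assignment.values())
    for (a, b), pref in edges.items():
        e += pairwise_cost(assignment[a][0], assignment[b][0], pref)
    return e

# MAP inference by exhaustive search over the tiny candidate sets.
best = min(
    (dict(zip(parts, combo))
     for combo in itertools.product(*(candidates[p] for p in parts))),
    key=total_energy,
)
for p in parts:
    print(p, best[p][0])
```

Real part-based detectors replace the brute-force search with efficient inference (e.g., belief propagation or distance transforms), since the number of configurations grows exponentially with the number of parts.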
