5th Joint Symposium on Neural Computation, Vol. 8, May 16, 1998, San Diego, CA

Reading English text with spatially-invariant receptive fields: when the 'binding problem' isn't

Abstract

Accumulating experimental evidence suggests that (1) recognition of complex objects and scenes can sometimes occur in a single feedforward pass through the visual system (Fize et al., 1998), and (2) the visual field may be encoded during this first pass in terms of a population of spatially-invariant detectors of visual mini-patterns (Kobatake and Tanaka, 1994). We study here some of the computational properties of visual representations based on spatially-invariant receptive fields (RFs), using the domain of text as a convenient surrogate for visual recognition in general. We begin by developing an analytical model that makes explicit how recognition performance is affected by (1) the number of object categories (words) that must be distinguished, (2) the complexity of individual objects (length of words), (3) the number and order of binding of elemental features (letters) included in the representation, and (4) the clutter load, i.e. the amount of visual material (text) in the field of view in which multiple objects must be recognized without explicit segmentation. We show that the model achieves good fits to recognition rates for English text over a wide range of clutter loads, word sizes, and feature counts. We then show, using a quasi-supervised greedy algorithm for feature learning, that fewer than 1,500 mostly low-order, spatially-invariant letter-tuple detectors (akin to Wickelfeatures) are needed to unambiguously represent all the words simultaneously present in randomly chosen windows of text up to 50 characters in width. Our results help explain how, and under what conditions, spatially-invariant RF-based representations can process multiple objects simultaneously without explicit segmentation processes, and they lend support to the notion that representations of this simple kind may underlie important aspects of primate/human recognition.
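As a rough illustration of the kind of representation the abstract describes, the sketch below encodes a window of text as an unordered set of adjacent letter tuples and counts a word as "seen" whenever all of its own tuples occur somewhere in the window. This is a toy, not the authors' analytical model or their greedy feature-learning algorithm: the function names, the `_` boundary marker, the tiny vocabulary, and the use of every adjacent tuple of a fixed order are all assumptions made for the example.

```python
from __future__ import annotations


def tuple_features(text: str, order: int) -> set[str]:
    """Return the position-free set of adjacent letter tuples in `text`.

    Assumption: word boundaries are marked with "_" and every adjacent
    tuple of length `order` acts as one detector. Texts shorter than
    `order` (after padding) yield an empty set.
    """
    padded = "_" + "_".join(text.lower().split()) + "_"
    return {padded[i:i + order] for i in range(len(padded) - order + 1)}


def appears_present(word: str, window: str, order: int) -> bool:
    """A word is 'recognized' if all of its tuples fire anywhere in the window."""
    return tuple_features(word, order) <= tuple_features(window, order)


def hallucinated_words(vocabulary: list[str], window: str, order: int) -> list[str]:
    """Vocabulary words that look present but are not actually in the window."""
    actually_present = set(window.lower().split())
    return sorted(
        word
        for word in vocabulary
        if word not in actually_present and appears_present(word, window, order)
    )


if __name__ == "__main__":
    vocabulary = ["car", "bat", "cat", "bar"]  # hypothetical toy vocabulary
    window = "car bat"
    for order in (2, 3):
        print(order, hallucinated_words(vocabulary, window, order))
```

With letter pairs (`order=2`), the window "car bat" also supports "bar" and "cat", because every pair those words need is contributed by one of the two words actually shown. With letter triples (`order=3`), neither false match survives. In this toy setting, the trade-off between the order of binding and the clutter load shows up as a change in which absent words are falsely supported.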
