Journal of Vision
Turn that frown upside-down! Inferring facial actions from pairs of images in a neurally plausible computational model



Abstract

Most approaches to image recognition focus on inferring a categorical label or action code from a static image, ignoring dynamic aspects of appearance that may be critical to perception. Even methods that examine behavior over time, such as in a video sequence, tend to label each frame independently, ignoring frame-to-frame dynamics. This viewpoint suggests that what matters is time-independent categorical information, not the patterns of actions that relate stimulus configurations across time. The current work focuses on face perception and demonstrates that important information can be extracted from pairs of images by examining how the face transforms in appearance from one image to the other. Using a biologically plausible neural network model called a conditional Restricted Boltzmann Machine, which performs unsupervised Hebbian learning, we show that the network can infer various facial actions from a sequence of images (e.g., transforming a frown into a smile, or moving the face from one location of the image frame to another). Critically, after inferring the actions relating two face images from one individual, the network can apply the transformation to a test face from an unknown individual, without any knowledge of facial identity, expressions, or muscle movements. By visualizing the factors that encode and break facial actions down into a distributed representation, we demonstrate a kind of factorial action code that the network learns in an unsupervised manner to separate identity characteristics from rigid (affine) and non-rigid expression transformations. Models of this sort suggest that neural representations of action can factor out information about a face or object that remains constant, such as its identity, from its dynamic behavior, both of which are important aspects of perceptual inference.
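The model described above can be sketched in miniature. The following is a minimal, illustrative factored conditional RBM in the style the abstract describes (three-way factors gating input image, output image, and hidden "mapping" units), not the authors' actual implementation: all sizes, names, and the CD-1 training rule are assumptions for the sketch, and the Wx update is omitted for brevity. The toy task, learning a fixed one-pixel shift and transferring it to an unseen image, stands in for the paper's transfer of facial actions across identities.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

class FactoredGatedRBM:
    """Minimal factored conditional RBM (illustrative sketch).

    Hidden 'mapping' units h encode the transformation taking input
    image x to output image y through n_fac three-way factors.
    """

    def __init__(self, n_pix, n_hid, n_fac, lr=0.01):
        self.Wx = 0.1 * rng.standard_normal((n_pix, n_fac))
        self.Wy = 0.1 * rng.standard_normal((n_pix, n_fac))
        self.Wh = 0.1 * rng.standard_normal((n_hid, n_fac))
        self.bh = np.zeros(n_hid)
        self.by = np.zeros(n_pix)
        self.lr = lr

    def infer_mapping(self, x, y):
        # p(h | x, y): each factor multiplies the two image projections
        fx, fy = x @ self.Wx, y @ self.Wy
        return sigmoid((fx * fy) @ self.Wh.T + self.bh)

    def apply_mapping(self, x, h):
        # p(y | x, h): transfer an inferred action to a (new) image x
        fx, fh = x @ self.Wx, h @ self.Wh
        return sigmoid((fx * fh) @ self.Wy.T + self.by)

    def cd1_step(self, x, y):
        # One contrastive-divergence update, a common stand-in for the
        # unsupervised Hebbian learning the abstract refers to.
        h0 = self.infer_mapping(x, y)
        y1 = self.apply_mapping(x, (rng.random(h0.shape) < h0).astype(float))
        h1 = self.infer_mapping(x, y1)
        fx = x @ self.Wx
        self.Wy += self.lr * (np.outer(y, fx * (h0 @ self.Wh)) -
                              np.outer(y1, fx * (h1 @ self.Wh)))
        self.Wh += self.lr * (np.outer(h0, fx * (y @ self.Wy)) -
                              np.outer(h1, fx * (y1 @ self.Wy)))
        self.bh += self.lr * (h0 - h1)
        self.by += self.lr * (y - y1)

# Toy demo: learn a fixed one-pixel shift, then transfer it.
shift = lambda v: np.roll(v, 1)
model = FactoredGatedRBM(n_pix=16, n_hid=8, n_fac=12)
for _ in range(200):
    x = (rng.random(16) < 0.3).astype(float)
    model.cd1_step(x, shift(x))

x_train = (rng.random(16) < 0.3).astype(float)
h = model.infer_mapping(x_train, shift(x_train))  # infer the action
x_new = (rng.random(16) < 0.3).astype(float)      # unseen "identity"
y_pred = model.apply_mapping(x_new, h)            # apply it to x_new
```

The key property mirrored here is that `infer_mapping` sees only one image pair, yet the resulting `h` can be handed to `apply_mapping` together with a different image, separating the action code from the content of any particular image.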
