In one configuration, a visual object tracking apparatus is provided that receives a position of an object in a first frame of a video, and determines a current position of the object in subsequent frames of the video using a Siamese neural network To facilitate determining the current position of the object, the apparatus may adjust a spatial resolution of an image, adjust a size of a probe region, and/or adjust a scale of a plurality of sampled images. In one configuration, a visual object tracking using a Siamese neural network is provided. The apparatus feeds outputs from a plurality of subnetworks of the Siamese neural network to a comparison layer. In addition, the apparatus compares, at the comparison layer, inputs from the plurality of subnetworks to generate a comparison result. Further, the apparatus combines comparison results based on weights to obtain a final comparison result.
展开▼