Efficient multiview video encoding is realized even when the processing picture cannot be obtained, by accurately estimating a motion vector and by using both inter-camera correlation and temporal correlation when predicting the video signal. A view synthesized picture for the time at which the processing picture was taken is generated, based on the same settings as those of the processing camera, from a reference camera video taken by a camera other than the processing camera that took the processing picture included in the multiview video. A motion vector is then estimated, without using the processing picture, by searching a reference picture taken by the processing camera for the region that corresponds to the picture signal on the view synthesized picture at the position of the processing region on the processing picture.
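The motion vector search described above can be illustrated with a minimal sketch. It assumes an exhaustive block-matching search with a sum of absolute differences (SAD) cost; the abstract does not specify the matching criterion or search strategy, so both are assumptions, and the names `sad`, `crop`, and `estimate_motion_vector` are hypothetical. Pictures are plain lists of rows of luminance values. Note that the target block is read from the view synthesized picture, never from the processing picture.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(
        abs(a - b)
        for row_a, row_b in zip(block_a, block_b)
        for a, b in zip(row_a, row_b)
    )


def crop(picture, top, left, size):
    """Return the size x size block whose top-left corner is (top, left)."""
    return [row[left:left + size] for row in picture[top:top + size]]


def estimate_motion_vector(synth, reference, top, left, size, search_range):
    """Estimate a motion vector for the processing region at (top, left).

    synth:     view synthesized picture at the processing time (assumed input)
    reference: picture previously taken by the processing camera
    Returns ((dy, dx), cost) for the best-matching region in `reference`.
    """
    # The target signal comes from the view synthesized picture only.
    target = crop(synth, top, left, size)
    height, width = len(reference), len(reference[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            y, x = top + dy, left + dx
            # Skip candidate regions that fall outside the reference picture.
            if y < 0 or x < 0 or y + size > height or x + size > width:
                continue
            cost = sad(target, crop(reference, y, x, size))
            if best is None or cost < best[0]:
                best = (cost, dy, dx)
    if best is None:
        raise ValueError("processing region does not fit in the reference picture")
    return (best[1], best[2]), best[0]
```

A full search keeps the sketch short and easy to check; a practical encoder would more likely use a faster search pattern and a cost that also accounts for the reliability of the synthesized signal, neither of which is shown here.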