A video data-based fraud detection method, comprising: obtaining video data requiring detection; extracting image data of video frames from the video data, and dividing the image data into multiple image data sets according to the time sequence of the video frames, each image data set comprising image data corresponding to consecutive video frames; inputting each image data set into a pre-trained image feature extraction model to obtain an image feature vector for that image data set; extracting voice data from the video data, and obtaining voice feature vectors of the voice data; concatenating the image feature vectors and the voice feature vectors to obtain a multi-modal feature vector; and inputting the multi-modal feature vector into a pre-trained fraud detection model to obtain a fraud detection result, output by the fraud detection model, for the video data requiring detection.
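
The data flow of the steps above can be illustrated with a minimal sketch. It is not the claimed implementation: the abstract does not specify either pre-trained model, so `image_features`, `voice_features`, and `fraud_score` are hypothetical stand-ins (a per-channel mean, band-averaged spectrum magnitudes, and a logistic scorer), and the function names, the `set_size` parameter, and the array shapes are all assumptions made for illustration.

```python
import numpy as np


def split_into_frame_sets(frames, set_size):
    """Group frames, kept in temporal order, into sets of consecutive frames.

    The last set may be shorter when len(frames) is not a multiple of set_size.
    """
    return [frames[i:i + set_size] for i in range(0, len(frames), set_size)]


def image_features(frame_set):
    """Hypothetical stand-in for the pre-trained image feature extraction model.

    frame_set has shape (n_frames, height, width, channels); the result is the
    mean value of each channel over the whole set.
    """
    return frame_set.mean(axis=(0, 1, 2))


def voice_features(audio, n_bands=4):
    """Hypothetical stand-in for voice feature extraction.

    Returns the mean spectrum magnitude in n_bands contiguous frequency bands.
    """
    spectrum = np.abs(np.fft.rfft(audio))
    return np.array([band.mean() for band in np.array_split(spectrum, n_bands)])


def multimodal_vector(frames, audio, set_size):
    """Concatenate the per-set image vectors with the voice vector."""
    image_vecs = [image_features(s) for s in split_into_frame_sets(frames, set_size)]
    return np.concatenate(image_vecs + [voice_features(audio)])


def fraud_score(vec, weights, bias):
    """Hypothetical stand-in for the pre-trained fraud detection model.

    A logistic scorer that maps the multi-modal vector to a value in (0, 1).
    """
    return 1.0 / (1.0 + np.exp(-(vec @ weights + bias)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    frames = rng.random((8, 16, 16, 3))   # 8 synthetic RGB frames
    audio = rng.standard_normal(1600)     # synthetic mono audio samples
    vec = multimodal_vector(frames, audio, set_size=4)
    print(vec.shape)
    print(fraud_score(vec, np.zeros(vec.shape[0]), 0.0))
```

With 8 frames and `set_size=4`, the sketch produces two image data sets, so the multi-modal vector holds two 3-element image vectors followed by one 4-element voice vector. In practice the stand-ins would be replaced by trained models, and the weights of the scorer would come from training rather than being set by hand.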