PROBLEM TO BE SOLVED: To provide a frameworking method and apparatus for detecting violence in video. SOLUTION: A violence detection frameworking method includes a first step of extracting a two-dimensional (2D) luminance-component (Y) image from each frame image of an input video by excluding the color-difference components U and V; a second step of obtaining a three-dimensional (3D) Y image group by sequentially accumulating the 2D Y images along the time axis and extracting only frames at equal intervals; and a third step of deriving violent scenes by performing a video convolution on the 3D Y image group using a 3*3*3 filter. A network that is lightweight and optimized for spatio-temporal video processing is thereby built and applied to the algorithm, and the characteristic features of violence are continuously memorized and re-learned in a specific layer during the video convolution process. [Selection diagram] Figure 4
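The three steps can be illustrated with a minimal NumPy sketch. It is not the claimed implementation: the function names, the (H, W, 3) frame layout with Y in channel 0, the sampling stride, and the averaging kernel are all assumptions made for illustration. In the described method the 3*3*3 filter weights would be learned by the network, and the "convolution" here is the unflipped sliding-window form commonly used in neural networks.

```python
import numpy as np


def extract_y(frame_yuv):
    """Step 1 (sketch): keep the luminance plane, drop U and V.

    Assumes frame_yuv is an (H, W, 3) array with channels ordered Y, U, V.
    """
    return frame_yuv[..., 0].astype(np.float32)


def build_y_volume(frames_yuv, stride):
    """Step 2 (sketch): stack 2D Y images along time, keep every stride-th frame."""
    y_stack = np.stack([extract_y(f) for f in frames_yuv], axis=0)  # (T, H, W)
    return y_stack[::stride]


def conv3d_valid(volume, kernel):
    """Step 3 (sketch): slide a 3D filter over a (T, H, W) volume, no padding."""
    windows = np.lib.stride_tricks.sliding_window_view(volume, kernel.shape)
    # windows has shape (T-2, H-2, W-2, 3, 3, 3) for a 3*3*3 kernel
    return np.einsum("thwijk,ijk->thw", windows, kernel)


# Hypothetical input: 8 synthetic frames whose pixel values equal the frame index.
frames = [np.full((6, 6, 3), t, dtype=np.uint8) for t in range(8)]

volume = build_y_volume(frames, stride=2)            # keeps frames 0, 2, 4, 6
kernel = np.ones((3, 3, 3), dtype=np.float32) / 27   # placeholder, not learned weights
features = conv3d_valid(volume, kernel)

print(volume.shape)    # (4, 6, 6)
print(features.shape)  # (2, 4, 4)
```

Working on the Y plane alone cuts the input to one channel instead of three, and equal-interval sampling shortens the time axis, which is consistent with the abstract's aim of a lightweight network.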