The present disclosure relates to systems, methods, and non-transitory computer readable media for generating digital video summaries based on analyzing a digital video utilizing a relevancy neural network, an aesthetics neural network, and/or a generative neural network. For example, the disclosed systems can utilize an aesthetics neural network to determine aesthetics scores for frames of a digital video and a relevancy neural network to generate importance scores for frames of the digital video. Utilizing the aesthetics scores and importance scores, the disclosed systems can select a subset of frames and apply a generative reconstructor neural network to create a digital video reconstruction. By comparing the digital video reconstruction and the original digital video, the disclosed systems can accurately identify representative frames and flexibly generate a variety of different digital video summaries.
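The score, select, reconstruct, and compare flow described above can be illustrated with a minimal sketch. It is not the disclosed implementation: the per-frame scores are assumed to be already computed by the two scoring networks, the weighted sum used to combine them is an assumption, and a nearest-selected-frame copy stands in for the generative reconstructor neural network. All function names, the `weight` parameter, and the toy one-value "frames" are hypothetical.

```python
from typing import List, Sequence

Frame = Sequence[float]  # toy stand-in for a frame's pixel or feature values


def select_frames(aesthetics: Sequence[float], importance: Sequence[float],
                  k: int, weight: float = 0.5) -> List[int]:
    """Return indices of the k frames with the highest combined score.

    Assumption: scores are combined by a weighted sum; the source does not
    state how the two scores are combined.
    """
    combined = [weight * a + (1.0 - weight) * r
                for a, r in zip(aesthetics, importance)]
    ranked = sorted(range(len(combined)), key=lambda i: combined[i], reverse=True)
    return sorted(ranked[:k])  # restore temporal order


def reconstruct(frames: Sequence[Frame], selected: Sequence[int]) -> List[Frame]:
    """Toy reconstruction: copy the temporally nearest selected frame.

    Assumption: this replaces the generative reconstructor network and
    requires at least one selected frame.
    """
    return [frames[min(selected, key=lambda s: abs(s - i))]
            for i in range(len(frames))]


def reconstruction_error(frames: Sequence[Frame], rebuilt: Sequence[Frame]) -> float:
    """Mean squared difference between the original and the reconstruction."""
    total, count = 0.0, 0
    for original, copy in zip(frames, rebuilt):
        for x, y in zip(original, copy):
            total += (x - y) ** 2
            count += 1
    return total / count


if __name__ == "__main__":
    frames = [[0.0], [0.1], [1.0], [1.1]]
    aesthetics = [0.9, 0.2, 0.8, 0.1]   # hypothetical network outputs
    importance = [0.7, 0.3, 0.9, 0.2]   # hypothetical network outputs
    chosen = select_frames(aesthetics, importance, k=2)
    error = reconstruction_error(frames, reconstruct(frames, chosen))
    print(chosen, error)
```

A low reconstruction error suggests that the selected frames are representative of the full video; in this sketch, different values of `k` or `weight` would yield different candidate summaries to compare.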