Real-time scene understanding system employing an object detection module with an algorithm for localization and classification of objects in an image, and a semantic segmentation module with an algorithm for classification of individual pixels in the image, wherein the system comprises an encoder module operable on an input image for the extraction of notable features in the input image, one or more attention modules to attribute among the notable features in the input image as provided by the encoder a relative contribution of each of such notable features in an output image to be reconstructed from the input image, and a decoder module for reconstructing the output image using the notable features, wherein the reconstructed output image is made available to the object detection module with the algorithm for localization and classification of objects in the image, and to the semantic segmentation module with the algorithm for classification of individual pixels in the image.
展开▼