In an embodiment, a method for representing a surrounding environment of an ego autonomous driving vehicle (ADV) is described. The method represents the surrounding environment using a first set of features from a high definition (HD) map and a second set of features from a target object in the surrounding environment. The first set of features is extracted from the HD map using a convolutional neural network (CNN), and the second set of features consists of handcrafted features of the target object collected during a predetermined number of past driving cycles of the ego ADV. The first set of features and the second set of features are concatenated and provided to a number of fully connected layers of the CNN to predict behaviors of the target object. In one embodiment, the operations in the method can be repeated for each driving cycle of the ego ADV.
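The data flow described above (convolutional map features, handcrafted target-object features, concatenation, fully connected layers) can be illustrated with a minimal pure-Python sketch. It is not the claimed implementation: the raster size, the two 3x3 kernels, the per-cycle state layout `(x, y, speed, heading)`, the three past driving cycles, the hidden width, the three behavior classes, and all function names are assumptions introduced for illustration, and the weights are random rather than trained.

```python
import math
import random

def conv2d_valid(image, kernel):
    """Single-channel 'valid' cross-correlation followed by ReLU."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(len(image) - kh + 1):
        row = []
        for j in range(len(image[0]) - kw + 1):
            s = sum(image[i + a][j + b] * kernel[a][b]
                    for a in range(kh) for b in range(kw))
            row.append(max(0.0, s))
        out.append(row)
    return out

def map_features(hd_map_raster, kernels):
    """First feature set: one global-average-pooled value per kernel."""
    feats = []
    for kernel in kernels:
        cells = [v for row in conv2d_valid(hd_map_raster, kernel) for v in row]
        feats.append(sum(cells) / len(cells))
    return feats

def handcrafted_features(past_states):
    """Second feature set: flattened per-cycle states (assumed layout:
    x, y, speed, heading) over the past driving cycles."""
    return [value for state in past_states for value in state]

def dense(x, weights, bias):
    """Fully connected layer: one output per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    total = sum(e)
    return [v / total for v in e]

def predict_behavior(hd_map_raster, past_states, params):
    # Concatenate the two feature sets, then apply the fully connected layers.
    combined = (map_features(hd_map_raster, params["kernels"])
                + handcrafted_features(past_states))
    hidden = [max(0.0, v) for v in dense(combined, params["w1"], params["b1"])]
    return softmax(dense(hidden, params["w2"], params["b2"]))

def init_params(n_kernels, n_handcrafted, n_hidden, n_classes, seed=0):
    """Random (untrained) parameters, for shape illustration only."""
    rng = random.Random(seed)
    def rand(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)]
                for _ in range(rows)]
    n_in = n_kernels + n_handcrafted
    return {
        "kernels": [rand(3, 3) for _ in range(n_kernels)],
        "w1": rand(n_hidden, n_in), "b1": [0.0] * n_hidden,
        "w2": rand(n_classes, n_hidden), "b2": [0.0] * n_classes,
    }

# Hypothetical 8x8 map raster with two lane lines (columns 2 and 5).
hd_map_raster = [[1.0 if c in (2, 5) else 0.0 for c in range(8)]
                 for _ in range(8)]
# Hypothetical target-object states for three past driving cycles.
past_states = [(0.0, 0.0, 5.0, 0.0), (0.5, 0.0, 5.1, 0.01), (1.0, 0.1, 5.2, 0.02)]
params = init_params(n_kernels=2, n_handcrafted=12, n_hidden=8, n_classes=3)
probs = predict_behavior(hd_map_raster, past_states, params)
```

Here `probs` is a probability distribution over the assumed behavior classes for a single driving cycle; repeating the call with an updated raster and state history would correspond to running the method once per driving cycle.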