Several studies have focused on single-person human activity recognition, while the classification of group activities remains under-investigated. In this paper, we present an approach for classifying the activity performed by a group of people during daily work tasks. We address the problem hierarchically: we first examine individual actions, reconstructed from data collected by wearable and ambient sensors, and then assess whether common temporal/spatial dynamics emerge at the level of the group activity. To this end, we deploy a Multimodal Deep Learning Network, where the term multimodal does not refer to processing the different input modalities separately, but to extracting activity-related features for each group member and then merging them through shared layers. We evaluated the proposed approach in a laboratory environment where employees were monitored during their normal activities. The experimental results demonstrate the effectiveness of the proposed model compared with an SVM benchmark.
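To make the per-member/shared-layer structure concrete, the sketch below shows one minimal way such a network could be wired up. It is an illustration only: the member count, feature dimensions, layer sizes, and the use of PyTorch are assumptions, not details taken from the paper.

```python
# Hypothetical sketch of per-member feature branches merged through shared
# layers, as described in the abstract. All sizes and the framework choice
# (PyTorch) are illustrative assumptions.
import torch
import torch.nn as nn

class GroupActivityNet(nn.Module):
    def __init__(self, n_members=4, in_dim=64, feat_dim=32, n_classes=5):
        super().__init__()
        # One branch per group member: extracts activity-related features
        # from that member's (wearable + ambient) sensor representation.
        self.branches = nn.ModuleList([
            nn.Sequential(nn.Linear(in_dim, feat_dim), nn.ReLU())
            for _ in range(n_members)
        ])
        # Shared layers: merge the per-member features and classify the
        # group activity.
        self.shared = nn.Sequential(
            nn.Linear(n_members * feat_dim, 64),
            nn.ReLU(),
            nn.Linear(64, n_classes),
        )

    def forward(self, x):
        # x: (batch, n_members, in_dim), one sensor feature vector per member
        feats = [branch(x[:, i]) for i, branch in enumerate(self.branches)]
        return self.shared(torch.cat(feats, dim=1))

# Example usage with a random batch of 8 group observations.
net = GroupActivityNet()
logits = net(torch.randn(8, 4, 64))  # -> (8, 5) group-activity scores
```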