The paper introduces the Multimodal Multi-view Integrated Database (MMID), which stores human activities in presentation situations. MMID contains audio, video, human body motion, and transcripts, all related to one another by their time of occurrence. MMID accepts basic queries over the stored data. By referring to the retrieved data, one can examine how the different modalities are used cooperatively and complementarily in real situations. Such examination across different situations is essential for understanding human behaviors, since they depend heavily on context and personal characteristics. In this sense, MMID can serve as a basis for systematic or statistical analysis of these modalities, and as a useful tool for designing intelligent user interface systems or multimedia content handling systems. The authors present the database design and its possible applications.
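The core idea — records from different modalities linked by their occurrence time and retrievable by basic queries — can be illustrated with a minimal sketch. Everything below (the `Record` type, the `query_by_time` helper, and the sample session data) is a hypothetical illustration, not the paper's actual schema or API:

```python
from dataclasses import dataclass

# Hypothetical record: each stored item is tagged with its modality
# and the time interval of its occurrence (seconds from session start).
@dataclass
class Record:
    modality: str   # e.g. "audio", "video", "motion", "transcript"
    start: float
    end: float
    payload: str

def query_by_time(records, t0, t1, modality=None):
    """Return records whose interval overlaps [t0, t1],
    optionally restricted to a single modality."""
    return [r for r in records
            if r.start < t1 and r.end > t0
            and (modality is None or r.modality == modality)]

# Illustrative session data (invented for this sketch).
session = [
    Record("transcript", 0.0, 2.5, "Good morning, everyone."),
    Record("motion", 1.0, 3.0, "right-hand pointing gesture"),
    Record("video", 0.0, 10.0, "camera-1 front view"),
]

# Retrieve everything co-occurring with the opening utterance,
# across all modalities:
hits = query_by_time(session, 0.0, 2.5)
```

A query like this is how one would examine, for a given moment, which gestures and camera views accompany a given utterance — the kind of cross-modal inspection the abstract describes.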