When a scene group containing a plurality of scenes is input (for example, a group of scenes bookmarked while viewing video content), explanatory descriptions that distinguish the scenes from one another are selected from a group of metadata representing the characteristics of each scene in the scene group. The combination of scenes and metadata selected is the one with the largest distance between the metadata, and the explanatory description selected for each scene in the scene group is added to that scene.
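The selection step can be illustrated with a minimal sketch. It is not the claimed implementation: the abstract does not define the distance measure or the search strategy, so the Jaccard word-set distance, the summed pairwise score, the exhaustive search, and the names `jaccard_distance` and `select_descriptions` are all assumptions made for illustration.

```python
from __future__ import annotations

from itertools import combinations, product


def jaccard_distance(a: str, b: str) -> float:
    """Assumed metadata distance: 1 - |A & B| / |A | B| over lowercase word sets."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    union = words_a | words_b
    if not union:
        return 0.0
    return 1.0 - len(words_a & words_b) / len(union)


def select_descriptions(scene_metadata: dict[str, list[str]]) -> dict[str, str]:
    """Pick one metadata item per scene so that the summed pairwise distance is largest.

    scene_metadata maps a (hypothetical) scene id to its candidate metadata strings.
    """
    scenes = list(scene_metadata)
    if any(not scene_metadata[s] for s in scenes):
        raise ValueError("every scene needs at least one candidate metadata item")

    best_score, best_choice = -1.0, ()
    # Exhaustive search over every way of assigning one candidate to each scene.
    # This is exponential in the number of scenes, so it only suits small groups.
    for choice in product(*(scene_metadata[s] for s in scenes)):
        score = sum(jaccard_distance(x, y) for x, y in combinations(choice, 2))
        if score > best_score:  # on ties, the first combination found is kept
            best_score, best_choice = score, choice
    return dict(zip(scenes, best_choice))
```

Calling `select_descriptions` with a mapping from each bookmarked scene to its candidate metadata returns one description per scene. Metadata shared by several scenes contributes a distance of zero, so it tends not to be chosen for more than one of them, which is what makes the selected descriptions useful for telling the scenes apart.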