PROBLEM TO BE SOLVED: To provide a speech intention model learning device that learns a model capable of correctly extracting a speech intention even when the intention is expressed in only part of a speech.

SOLUTION: A speech section is defined as a section that contains at least one word and in which the pauses between successive words are each no longer than a fixed time, and the voice in a speech section is defined as a speech. A local rhythm feature is the rhythm feature of each word section of the speech, and a local rhythm series feature is the feature obtained by concatenating the local rhythm features within each accent phrase section. A speech intention model for extracting a speech intention for each accent phrase section is learned, using as learning data the local rhythm series features and manually assigned speech intention labels for each accent phrase section.

SELECTED DRAWING: Figure 6
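The speech-section definition and the per-accent-phrase feature coupling described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation. The `Word` type, the function names, the 0.2 second pause threshold, and the numeric feature values are all hypothetical, and the accent phrase boundaries are assumed to be given by an earlier step.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Word:
    text: str
    start: float           # word start time in seconds
    end: float             # word end time in seconds
    features: List[float]  # hypothetical per-word rhythm feature values


def split_speech_sections(words: List[Word], max_pause: float = 0.2) -> List[List[Word]]:
    """Group time-ordered words into speech sections.

    Consecutive words stay in the same section while the pause between
    them is no longer than max_pause (an assumed threshold).
    """
    sections: List[List[Word]] = []
    for word in words:
        if sections and word.start - sections[-1][-1].end <= max_pause:
            sections[-1].append(word)
        else:
            sections.append([word])
    return sections


def local_rhythm_series(phrase_words: List[Word]) -> List[float]:
    """Concatenate the per-word features of one accent phrase section."""
    series: List[float] = []
    for word in phrase_words:
        series.extend(word.features)
    return series


words = [
    Word("a", 0.0, 0.3, [1.0, 2.0]),
    Word("b", 0.4, 0.7, [3.0, 4.0]),
    Word("c", 1.5, 1.8, [5.0, 6.0]),
]
sections = split_speech_sections(words)
# Treat the first speech section as a single accent phrase for illustration.
series = local_rhythm_series(sections[0])
```

In this sketch, each concatenated series would be paired with the manually assigned intention label of its accent phrase section to form one training example. The choice of classifier is left open because the abstract does not specify it.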