We seek to develop a robot capable of teaming with humans to accomplish physical exploration tasks in dynamic, dangerous environments that would not otherwise be possible. For such tasks, a human commander must be able to communicate with a robot that moves out of sight and relays information back. How best can we determine the way a human commander would interact in a multi-modal spoken dialog with such a robot to accomplish tasks? In this paper, we describe our initial approach to discovering a principled basis for coordinating the turn-taking, perception, and navigational behavior of a robot in communication with a commander, by identifying decision phases in dialogs collected in a Wizard-of-Oz (WoZ) framework. We present two types of utterance annotation, with examples, applied to task-oriented dialog between a human commander and a human "robot navigator" who controls the physical robot in a realistic environment similar to expected actual conditions. We discuss core robot capabilities that bear on the robot navigator's ability to take turns while performing a "find the building doors" task. The paper concludes with a brief overview of ongoing work to implement these decision phases within an open-source dialog management framework, constructing a task tree specification and dialog control logic for our application domain.
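To make the task tree specification mentioned above concrete, the sketch below shows one minimal way such a tree might be structured in Python. The node names and the decomposition of the "find the building doors" task are illustrative assumptions, not the paper's actual specification or the dialog framework's API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TaskNode:
    """A node in a task tree: a leaf action or a composite subtask."""
    name: str
    children: List["TaskNode"] = field(default_factory=list)

    def is_leaf(self) -> bool:
        return not self.children

    def leaf_names(self) -> List[str]:
        """Depth-first list of the primitive actions under this node."""
        if self.is_leaf():
            return [self.name]
        names: List[str] = []
        for child in self.children:
            names.extend(child.leaf_names())
        return names


# Hypothetical decomposition of the "find the building doors" task:
# navigation and perception subtasks, each ending in a report back to
# the commander (a natural turn-taking decision point).
find_doors = TaskNode("find-doors", [
    TaskNode("navigate-to-building", [
        TaskNode("move-to-waypoint"),
        TaskNode("report-arrival"),
    ]),
    TaskNode("scan-facade", [
        TaskNode("capture-image"),
        TaskNode("report-door-candidates"),
    ]),
])
```

In a dialog manager, each leaf would be paired with control logic deciding whether the robot navigator acts, reports, or yields the turn; the tree here only fixes the task structure.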