In this paper, we describe a system that integrates image processing and natural language processing for tasks that involve communicating visual information. The system determines the spatial relationships between objects in images and conveys this information as an English sentence. We are exploring the applicability of this system to two tasks: landmark navigation and the generation of descriptions of abnormal densities in radiographs. Our previous work described a computational model of preposition semantics and a method for handling some of the ambiguities associated with natural language. Here we concentrate on generating optimal locative expressions for object pairs. We explain the methodologies the system employs and illustrate their use through several examples for each task.