The work presented in this paper concerns discourse structure analysis and its applications to intra- and inter-document search. In a typical application, which could be called “rhetorical browsing”, the system assists a journal reader in focusing on texts and passages that present certain kinds of information and comment, according to his/her current interest: this may be raw information, possibly with a chronological dimension, or, on the contrary, analyses, recommendations, debates, etc. The discourse model can be related to Swales's “discourse moves” and the derived “argumentative zoning” procedures for scientific documents. However, owing to the nature of the texts considered, zones are defined in more “generalist” terms, following the classic Narration-Description-Argumentation-Prescription typology and especially C. Smith's notion of “discourse modes”. The paper presents some preliminary steps carried out to test the feasibility of the project. First, in order to ground our research on firm observations, we built a corpus of journalistic texts annotated according to this discourse model. These annotations yielded quantified results on the organization of discourse modes within texts. Second, an experimental procedure for automatically tagging text passages according to discourse modes was designed, implemented and tested on the corpus.