Digital Scholarship in the Humanities
An analysis of the relationship between cohesion and clause combination in English discourse employing NLP and data mining approaches
This study examines the relationship between the frequencies of clause combination and the distribution of discourse-pragmatic markers of cohesion in a sub-sample of the Susanne corpus. It addresses the theory that clause grammar constitutes a form of grammar-cued discourse coherence which functions as an integrated system with other methods of managing coherence in language. Evidence is sought for whether increased clause density in a corpus correlates with a reduction in explicit cohesive devices. To address this, a computational approach is outlined for the coding of cohesion in a corpus, using a semi-automated data mining procedure. To validate this approach, it is compared with cohesion measures on the same data using the NLP tool Coh-Metrix 3.0. The two approaches are shown to positively correlate on a series of measures, suggesting they significantly overlap in quantifying the cohesion construct. The final analysis of the tagged corpus indicates that as frequencies of clause combination increase in a text, the use of explicit lexical cohesive devices decreases. Also, higher frequencies of clause combination positively correlate with an increased use of grammatical cohesive devices. Findings are interpreted as generally aligning with the expectations of the theoretical framework known as the Adaptive Approach to Grammar.