Archives of human rights violation reports, by virtue of their poor metadata, basis in natural language, and scale, obscure fine-grained analyses of violation event patterns. Resolving cross-document coreference among victim or perpetrator mentions across a corpus is challenging, particularly when those mentions relate to different events. These challenges are emblematic of the transition from small-scale to big data analysis in the humanities. This paper discusses these challenges and proposes a framework to address them so as to explore narrative construction and the formation of collective memory. Though our framework is based on processing human rights violation reports, it can be readily extended to support other big data problems in the humanities.