In this paper, we augment the Boundary Annotated Qur’an dataset published at LREC 2012 (Brierley et al 2012; Sawalha et al 2012a) with automatically generated phonemic transcriptions of Arabic words. We have developed and evaluated a comprehensive grapheme-phoneme mapping from Standard Arabic > IPA (Brierley et al under review), and implemented the mapping in Arabic transcription technology which achieves 100% accuracy as measured against two gold standards: one for Qur’anic or Classical Arabic, and one for Modern Standard Arabic (Sawalha et al [1]). Our mapping algorithm has also been used to generate a pronunciation guide for a subset of Qur’anic words with heightened prosody (Brierley et al 2014). This is funded research under the EPSRC " Working Together" theme.
展开▼