The larger a parallel bilingual corpus is, the more effective and powerful it becomes. Building such corpora requires special effort, both in finding as many already-translated texts as possible and in designing sentence alignment algorithms with the lowest possible time complexity. In this paper, we propose algorithms that align the sentences of two Persian-English texts in linear time and with surprisingly high accuracy. The linear time complexity is achieved through our new language-independent anchor-finding algorithm, which enables us to align a parallel text as large as a whole book in a single attempt and with high accuracy. As far as we know, this project is the first automatic construction of a sentence-level English-Persian parallel corpus.