This paper deals with the development of spell-checkers for Indian languages, taking as an example Bangla, the second most popular language in the Indian subcontinent. The problems and current state of Indian-language spell-checkers are briefly reviewed, and our approach to a Bangla spell-checker is then elaborated. The technique works in two stages. The first stage handles phonetic-similarity errors. Phonetically similar characters are mapped to a single character code, and a new dictionary D_c is constructed over this reduced alphabet. A phonetically similar but wrongly spelt word can easily be corrected using this dictionary. The second stage handles errors other than phonetic similarity. Here a wrongly spelt word S of n characters is searched for in the dictionary D_c. If S is a nonword, its first k_1 ≤ n characters will match a valid word in D_c (if k_1 = n, the matching word in D_c must be longer than n). A reversed-word dictionary D_r is also generated, in which the characters of each word are stored in reverse order. If the last k_2 characters of S match a word in D_r, then a single error must lie within the intersection of the first k_1 + 1 and the last k_2 + 1 characters of S. We observed that in most cases this region is very small compared with the word length, and that the number of suggested corrections can be drastically reduced using this information. We have used our approach to correct Bangla text, where the problem of inflection is tackled by a simplified morphological analyser. Another problem encountered in Indian languages is the large number of compound words formed by euphony and assimilation. The problem of compound words is also carefully tackled.
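The two stages described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The phonetic classes, the toy Latin-letter word list, and every function name (`encode`, `error_window`, `suggest`, and so on) are assumptions made for illustration. The sketch repairs substitution errors only and does not cover inflection or compound words.

```python
# Hypothetical phonetic classes over Latin letters; the actual Bangla
# character classes are not given in the abstract.
PHONETIC_CODE = {"c": "k", "q": "k", "z": "s", "y": "i"}


def encode(word):
    """Map each character to the representative of its phonetic class."""
    return "".join(PHONETIC_CODE.get(ch, ch) for ch in word)


def build_coded_dictionary(words):
    """D_c: coded form -> original spellings that share that code."""
    d_c = {}
    for w in words:
        d_c.setdefault(encode(w), []).append(w)
    return d_c


def prefix_set(words):
    """All prefixes (including the empty one) of the given words."""
    return {w[:k] for w in words for k in range(len(w) + 1)}


def longest_match(s, prefixes):
    """Length of the longest prefix of s that begins some dictionary word."""
    for k in range(len(s), -1, -1):
        if s[:k] in prefixes:
            return k
    return 0


def error_window(s, fwd_prefixes, rev_prefixes):
    """0-based inclusive (lo, hi) range that must contain a single error.

    Intersection of the first k_1 + 1 and the last k_2 + 1 characters of s.
    """
    n = len(s)
    k1 = longest_match(s, fwd_prefixes)        # forward match against D_c
    k2 = longest_match(s[::-1], rev_prefixes)  # backward match against D_r
    return max(0, n - k2 - 1), min(k1, n - 1)


def suggest(word, d_c, fwd_prefixes, rev_prefixes):
    s = encode(word)
    if s in d_c:                               # stage 1: phonetic lookup
        return d_c[s]
    lo, hi = error_window(s, fwd_prefixes, rev_prefixes)
    alphabet = sorted({ch for w in d_c for ch in w})
    found = []
    for i in range(lo, hi + 1):                # stage 2: try only the window
        for ch in alphabet:
            candidate = s[:i] + ch + s[i + 1:]
            for w in d_c.get(candidate, []):
                if w not in found:
                    found.append(w)
    return found


words = ["kitten", "sitting", "mitten"]        # toy word list (assumption)
d_c = build_coded_dictionary(words)
fwd = prefix_set(d_c)                          # prefixes of coded words
rev = prefix_set(w[::-1] for w in d_c)         # prefixes of D_r entries

print(suggest("citten", d_c, fwd, rev))        # ['kitten'] (stage 1)
print(error_window("kitsen", fwd, rev))        # (3, 3)
print(suggest("kitsen", d_c, fwd, rev))        # ['kitten'] (stage 2)
```

In this toy run the forward match gives k_1 = 3 and the backward match gives k_2 = 2 for the six-character string "kitsen", so the window collapses to the single position 3 and only that position is tried for repair. Storing every prefix in a set keeps the sketch short. A trie over D_c and D_r would be the more economical choice for a full-size dictionary.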