Design and Implementation of NLP-based Spell Checker for the Tamil Language
1  Virtuous Transactional Analytics Pvt. Ltd., Noida 201309, India

Abstract:

A spell checker is a tool that detects spelling mistakes in text. More recently, spell checkers have also come to suggest possible corrections for the mistakes they detect. Tamil is one of the oldest surviving languages in the world, is spoken internationally, and has a very rich grammar. Grammar is vital for effective communication and information transmission. However, learning the language rules, together with the old teaching methodology, poses a challenge for researchers. Combining computing and language through natural language processing (NLP) offers a solution to this problem. In this paper, an advanced NLP technique is used to detect misspelled words in Tamil text and to provide suggested corrections, along with the probability of occurrence of each suggested word in the corpus. The proposed model suggests corrections for misspelled words using the minimum edit distance (MED) algorithm, customized for the Tamil vocabulary. A distance matrix is created between the misspelled word and all possible permutations of the word. Dynamic programming is used to calculate the fewest changes needed to correct a misspelled word and to suggest the most appropriate words as corrections.

Keywords: automatic spelling suggestion; dynamic programming; minimum edit distance (MED); natural language processing (NLP); Tamil spell checker
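
The minimum edit distance computation described in the abstract can be illustrated with a short Python sketch. This is not the authors' implementation: the abstract does not say how MED is customized for Tamil, so the sketch assumes one plausible choice, namely comparing Tamil letters as units (a base character plus any attached vowel sign or pulli) rather than raw code points. The function names `tamil_units`, `min_edit_distance`, and `suggest`, the unit edit costs, and the `max_distance` cutoff are all hypothetical, and the corpus probability is assumed to be a simple relative frequency.

```python
import unicodedata
from collections import Counter


def tamil_units(word):
    """Split a word into letters, attaching combining marks to their base.

    Assumption: Tamil vowel signs and the pulli carry Unicode general
    category Mc or Mn, so they are merged into the preceding character.
    """
    units = []
    for ch in word:
        if units and unicodedata.category(ch) in ("Mn", "Mc"):
            units[-1] += ch
        else:
            units.append(ch)
    return units


def min_edit_distance(source, target):
    """Levenshtein distance over letter units, via dynamic programming."""
    a, b = tamil_units(source), tamil_units(target)
    # dist[i][j] holds the fewest edits turning a[:i] into b[:j].
    dist = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) + 1):
        dist[i][0] = i  # delete every unit of a[:i]
    for j in range(len(b) + 1):
        dist[0][j] = j  # insert every unit of b[:j]
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            substitution = 0 if a[i - 1] == b[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,                 # deletion
                dist[i][j - 1] + 1,                 # insertion
                dist[i - 1][j - 1] + substitution,  # match or substitution
            )
    return dist[len(a)][len(b)]


def suggest(word, corpus_counts, max_distance=2, top_n=5):
    """Rank corpus words by distance, then by relative corpus frequency."""
    total = sum(corpus_counts.values())
    scored = []
    for candidate, count in corpus_counts.items():
        distance = min_edit_distance(word, candidate)
        if distance <= max_distance:
            scored.append((candidate, distance, count / total))
    # Closest words first; ties are broken by higher probability.
    scored.sort(key=lambda item: (item[1], -item[2]))
    return scored[:top_n]
```

Calling `suggest(word, Counter(tokens))` on a tokenized corpus returns `(candidate, distance, probability)` tuples. Scanning the whole vocabulary is the simplest possible design and is only practical for small word lists; a real system would likely restrict the candidate set first.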