Friday 23rd of February 2018

A Word Matching Algorithm in Handwritten Arabic Recognition Using Multiple-Sequence Weighted Edit Distances

Gheith A. Abandah and Fuad T. Jamour

No satisfactory solutions are yet available for the offline recognition of handwritten cursive words, including the words of Arabic text. Word matching algorithms can greatly improve OCR output when the words to be recognized come from a known, limited vocabulary. This paper describes the word matching algorithm used in JU-OCR2, an optical character recognition system for handwritten Arabic words. The system achieves state-of-the-art accuracy through multiple techniques, including an efficient word matching algorithm. On the IfN/ENIT database of handwritten Arabic words, this algorithm reduces the average sequence error of 32.3% to an average word error of just 5.0%. The algorithm is a weighted version of the edit distance algorithm, and the weighted version has a 5.0% advantage over the plain edit distance algorithm. It selects the best match using a set of multiple probable sequences from the sequence transcription stage. Using multiple sequences instead of one reduces the average error by 27.0% over the weighted edit distance algorithm. Compared with an algorithm used in a leading system, this algorithm offers 6.7% lower average word error on the two main test sets.

Keywords: Optical Character Recognition, Handwritten Arabic Words, Word Matching, Edit Distance.
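
The idea summarized in the abstract, matching several candidate transcriptions against a limited vocabulary with a weighted edit distance, can be illustrated with a minimal sketch. This is not the JU-OCR2 implementation: the cost values, the CONFUSABLE set, the function names, and the rule of taking the minimum distance across candidates are all assumptions made for illustration, and Latin placeholder strings stand in for Arabic character sequences.

```python
from typing import Callable, Iterable, Sequence, Tuple

# Hypothetical set of character pairs that a recognizer often confuses.
# The actual weights used by the paper are not given in the abstract.
CONFUSABLE = {frozenset("uv"), frozenset("il")}


def substitution_cost(a: str, b: str) -> float:
    """Assumed weights: 0 for a match, 0.5 for a confusable pair, 1 otherwise."""
    if a == b:
        return 0.0
    if frozenset((a, b)) in CONFUSABLE:
        return 0.5
    return 1.0


def weighted_edit_distance(
    source: Sequence[str],
    target: Sequence[str],
    sub_cost: Callable[[str, str], float],
    ins_cost: float = 1.0,
    del_cost: float = 1.0,
) -> float:
    """Dynamic-programming edit distance with per-operation weights."""
    # prev[j] holds the cost of turning the current source prefix into target[:j].
    prev = [j * ins_cost for j in range(len(target) + 1)]
    for i, s in enumerate(source, 1):
        curr = [i * del_cost]
        for j, t in enumerate(target, 1):
            curr.append(
                min(
                    prev[j] + del_cost,            # delete s
                    curr[j - 1] + ins_cost,        # insert t
                    prev[j - 1] + sub_cost(s, t),  # substitute or match
                )
            )
        prev = curr
    return prev[-1]


def match_word(candidates: Iterable[str], lexicon: Iterable[str]) -> Tuple[str, float]:
    """Return the lexicon word closest to any of the candidate sequences.

    Scoring a word by its minimum distance over all candidates is an assumed
    combination rule, chosen here only to show the use of multiple sequences.
    """
    candidates = list(candidates)
    best_word, best_cost = None, float("inf")
    for word in lexicon:
        cost = min(
            weighted_edit_distance(c, word, substitution_cost) for c in candidates
        )
        if cost < best_cost:
            best_word, best_cost = word, cost
    return best_word, best_cost


lexicon = ["tunis", "sfax", "gabes"]
print(match_word(["tvnis", "sunis"], lexicon))
```

In this sketch, the candidate "tvnis" differs from "tunis" by one confusable substitution, so it scores lower than the second candidate "sunis", which needs a full-cost substitution. Passing a substitution cost that is always 0 or 1 reduces the function to the plain edit distance.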

Gheith A. Abandah
Gheith A. Abandah received MSE and PhD degrees in Computer Science and Engineering from the University of Michigan in 1995 and 1998, respectively. He has been responsible for several funded projects in the areas of Arabic text recognition and electronic voting. He has more than 15 years of industrial experience in localization, military electronics, and product and project development and deployment. He is an associate professor at the University of Jordan.

Fuad T. Jamour
Fuad T. Jamour received a BS degree in Computer Engineering, with top honors, from the University of Jordan in 2011. He worked on handwritten Arabic word recognition as an undergraduate research assistant and in his BS project. He is currently a graduate student at King Abdullah University of Science and Technology (KAUST), and his interests include data management, cloud computing, and pattern recognition.

IJCSI is a refereed open access international journal for scientific papers dealing in all areas of computer science research...
