Friday 26th of April 2024
 

FST Based Morphological Analyzer for Hindi Language


Deepak Kumar, Manjeet Singh and Seema Shukla

Because Hindi is a highly inflectional language, a Finite State Transducer (FST) based approach is the most efficient way to develop a morphological analyzer for it. The work presented in this paper uses the SFST (Stuttgart Finite State Transducer) tool to generate the FST. A lexicon of root words is created first, and rules are then added to generate inflected and derived word forms from these roots. The resulting morphological analyzer was used in a Part Of Speech (POS) tagger built on the Stanford POS Tagger. The system was first trained on a manually tagged corpus, and the MAXENT (Maximum Entropy) approach of the Stanford POS Tagger was then used to tag input sentences. The morphological analyzer gives correct results in approximately 97% of cases. The POS tagger achieves approximately 87% accuracy on sentences whose words are known to the trained model file, and 80% accuracy on sentences containing words unknown to the trained model file.

Keywords: Morphological Analyzer, Finite State Transducer, POS Tagger, Lexicon Generator.
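To make the lexicon-plus-rules idea from the abstract concrete, here is a minimal Python sketch of suffix-based morphological analysis. It is not the authors' SFST grammar and does not use SFST at all: the romanized lexicon entries, the rule table, the tag names, and the `analyze` function are all assumptions made up for this illustration, covering only masculine nouns ending in -a.

```python
# Toy lexicon of root words: root -> part of speech.
# Hypothetical entries in romanized Hindi ("ladka" = boy, "kamra" = room).
LEXICON = {"ladka": "Noun", "kamra": "Noun"}

# Toy inflection rules for masculine nouns ending in -a.
# Each rule is (surface suffix, root suffix, feature tags).
# The tag names are illustrative, not the paper's tag set.
RULES = [
    ("a", "a", "<masc><sg><direct>"),
    ("e", "a", "<masc><pl><direct>"),
    ("e", "a", "<masc><sg><oblique>"),
    ("on", "a", "<masc><pl><oblique>"),
]


def analyze(word):
    """Return every root+tags analysis licensed by the lexicon and rules."""
    analyses = []
    for surface, root_suffix, tags in RULES:
        if word.endswith(surface):
            # Undo the inflection: strip the surface suffix, restore the root suffix.
            root = word[: len(word) - len(surface)] + root_suffix
            if root in LEXICON:
                analyses.append(f"{root}<{LEXICON[root]}>{tags}")
    return analyses


if __name__ == "__main__":
    for form in ["ladka", "ladke", "ladkon", "kamron"]:
        print(form, "->", analyze(form))
```

An ambiguous form such as "ladke" yields two analyses here (direct plural and oblique singular), which is the kind of ambiguity a downstream POS tagger would have to resolve from context. A real FST compiles the lexicon and rules into a single transducer that can run in both the analysis and generation directions; this sketch only mimics the analysis direction with a lookup loop.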



ABOUT THE AUTHORS

Deepak Kumar
B.Tech(IT) Final Year Student

Manjeet Singh
B.Tech(IT) Final Year Student

Seema Shukla
Associate Professor, Department of Computer Science and Engineering


About IJCSI

IJCSI is a refereed open access international journal for scientific papers dealing in all areas of computer science research...