Saturday 25th of November 2017
 

Person Name Recognition for Uyghur Using Conditional Random Fields


Muhtar Arkin, Abdurahim Mahmut and Askar Hamdulla

This paper describes a person name recognition system for Uyghur, a highly agglutinative language, based on conditional random fields (CRFs). We report experiments with various feature combinations for Uyghur and describe a method for building a Uyghur corpus from a set of hand-annotated sentences. Feature selection is an important factor in CRF-based person name recognition. The features we use include context words, word stems, the suffix and its length, whether a suffix is present, the first and last syllables of the word, POS information, and a dictionary feature. For evaluation, we run several experiments with different feature settings. The model achieves a recall of 81.86%, a precision of 88.79%, and an F-score of 85.19%.

Keywords: NER, Uyghur language, person name recognition, CRF, feature
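
The feature set listed in the abstract can be illustrated with a minimal Python sketch of per-token feature extraction for a CRF tagger. This is not the authors' implementation: the suffix list, the naive suffix stripper, the function names, and the transliterated example words are all assumptions made for illustration, and the first/last syllable features are left out because they would require a Uyghur syllabifier.

```python
from typing import Dict, List, Sequence, Set, Tuple

# Hypothetical placeholder suffixes in Latin transliteration (illustration only,
# not a verified inventory of Uyghur suffixes).
ASSUMED_SUFFIXES: Tuple[str, ...] = ("ning", "din", "ni", "ga", "da")


def split_stem_suffix(word: str,
                      suffixes: Sequence[str] = ASSUMED_SUFFIXES) -> Tuple[str, str]:
    """Naive longest-match suffix stripping; a stand-in for a real morphological analyzer."""
    for suf in sorted(suffixes, key=len, reverse=True):
        # Require at least two characters of stem to remain after stripping.
        if word.endswith(suf) and len(word) > len(suf) + 1:
            return word[:-len(suf)], suf
    return word, ""


def token_features(tokens: List[str], pos_tags: List[str], i: int,
                   name_dict: Set[str], window: int = 1) -> Dict[str, object]:
    """Build a feature dict for token i: stem, suffix, POS, dictionary and context features."""
    word = tokens[i]
    stem, suffix = split_stem_suffix(word)
    feats: Dict[str, object] = {
        "word": word,
        "stem": stem,
        "suffix": suffix,
        "suffix_len": len(suffix),
        "has_suffix": bool(suffix),
        "pos": pos_tags[i],
        "in_name_dict": stem in name_dict,  # dictionary feature, looked up on the stem
    }
    # Context words within the window; positions outside the sentence are padded.
    for off in range(-window, window + 1):
        if off == 0:
            continue
        j = i + off
        feats[f"word[{off:+d}]"] = tokens[j] if 0 <= j < len(tokens) else "<PAD>"
    return feats


if __name__ == "__main__":
    sentence = ["Alimning", "kitabi"]   # hypothetical transliterated tokens
    tags = ["N", "N"]                   # hypothetical POS tags
    print(token_features(sentence, tags, 0, name_dict={"Alim"}))
```

Feature dicts of this shape, one per token, are the usual input to CRF toolkits. Looking up the stem rather than the surface form in the name dictionary is one way to cope with agglutination, since a name followed by case suffixes would otherwise miss the dictionary.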


ABOUT THE AUTHORS

Muhtar Arkin
College of Information Science and Engineering, Xinjiang University, Urumqi, Xinjiang, 830046, P.R. China

Abdurahim Mahmut
College of Information Science and Engineering, Xinjiang University, Urumqi, Xinjiang, 830046, P.R. China

Askar Hamdulla
College of Software, Xinjiang University, Urumqi, Xinjiang, 830046, P.R. China


About IJCSI

IJCSI is a refereed open access international journal for scientific papers dealing in all areas of computer science research...
