The effect of N-gram indexing on Arabic documents retrieval
This article compares 3-gram and 4-gram term indexing for Arabic document retrieval. Similarity between queries and documents is computed for single-term and two-term queries, using corpora of Arabic-language documents collected from online Arabic news websites.
Keywords: n-gram, Arabic text indexing, information retrieval, text similarity.
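The comparison described in the abstract can be illustrated with a minimal sketch. The abstract does not say which similarity measure, tokenization, or normalization the article uses, so everything here is an assumption made for illustration: terms are split on whitespace, each term is broken into overlapping character n-grams, and the query is scored against each document with the Dice coefficient over n-gram sets. The function names (`char_ngrams`, `index_text`, `dice`, `rank`) are hypothetical and not taken from the article.

```python
def char_ngrams(term, n):
    """Return the set of overlapping character n-grams of one term.

    Terms shorter than n are kept whole (an assumption of this sketch).
    """
    if len(term) < n:
        return {term} if term else set()
    return {term[i:i + n] for i in range(len(term) - n + 1)}


def index_text(text, n):
    """Index a text as the union of the n-grams of its whitespace-separated terms."""
    grams = set()
    for term in text.split():
        grams |= char_ngrams(term, n)
    return grams


def dice(a, b):
    """Dice coefficient between two n-gram sets (assumed similarity measure)."""
    if not a and not b:
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))


def rank(query, documents, n):
    """Return (score, document index) pairs, best match first."""
    q = index_text(query, n)
    scores = [(dice(q, index_text(doc, n)), i) for i, doc in enumerate(documents)]
    return sorted(scores, key=lambda s: (-s[0], s[1]))


if __name__ == "__main__":
    docs = ["المكتبة الجديدة", "أخبار الرياضة"]
    # Compare 3-gram and 4-gram indexing for a single-term query.
    for n in (3, 4):
        print(n, rank("مكتبة", docs, n))
```

Running the same query with `n=3` and `n=4` shows the trade-off under comparison: shorter n-grams match more loosely across prefixed or suffixed word forms, while longer n-grams match more strictly.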
ABOUT THE AUTHOR
Emad Fawzi Al-Shalabi
Department of Information Technology, AL-BALQA Applied University, Al-Huson University College, Irbid, Al-Huson, 50, Jordan