Dewey Index Based Arabic Document Classification with Synonyms Merge Feature Reduction
Feature reduction is an important step before document classification. Classification performance is affected by the quality of the selected features. A new semantic approach is presented that uses synonym merging to preserve feature semantics and prevent important terms from being excluded. The resulting feature space was then processed with five feature selection methods: ID, TFIDF, CHI, IG and MI. Experiments show that classification performance increases after merging terms, with the best performance obtained for the CHI and IG selection methods. A classification technique based on the Dewey Decimal Classification system is also presented; it uses filtered indexes and three levels of classes from the Dewey system to classify and label Arabic documents. Combined with synonym merging, the technique shows promising results.
Keywords: Dimension reduction, Arabic text classification, synonyms.
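
As a rough illustration of the synonym merge idea described in the abstract, the following minimal Python sketch maps each term to one canonical representative before counting features. It is not the authors' implementation: the `SYNONYMS` table, the English placeholder terms, and the helper names `merge_synonyms` and `term_counts` are all assumptions made for this example, and an Arabic system would need its own synonym resource and preprocessing.

```python
from collections import Counter

# Hypothetical synonym table (illustrative English placeholders):
# each variant maps to one canonical term.
SYNONYMS = {
    "automobile": "car",
    "vehicle": "car",
    "physician": "doctor",
}


def merge_synonyms(tokens, synonyms=SYNONYMS):
    """Replace each token with its canonical synonym, if one is defined."""
    return [synonyms.get(tok, tok) for tok in tokens]


def term_counts(docs, synonyms=None):
    """Count term occurrences over tokenized docs, optionally merging synonyms first."""
    counts = Counter()
    for tokens in docs:
        if synonyms is not None:
            tokens = merge_synonyms(tokens, synonyms)
        counts.update(tokens)
    return counts


# Three tiny tokenized documents (made-up data for the sketch).
docs = [
    ["car", "doctor", "road"],
    ["automobile", "physician"],
    ["vehicle", "road"],
]

before = term_counts(docs)            # each synonym is a separate feature
after = term_counts(docs, SYNONYMS)   # synonyms collapsed into one feature

print(len(before), len(after))
print(after.most_common())
```

In this toy data the merged feature space is smaller, and the counts of rare variants are pooled into their canonical term, so a frequency-based selection method is less likely to discard them. Any of the selection methods named in the abstract could then be applied to the merged counts.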
ABOUT THE AUTHORS
Amal Alajmi
Helwan University
Elsayed M Saad
Helwan University
Medhat H Awadalla
Helwan University