Keyword Reduction for Text Categorization using Neighborhood Rough Sets


Si-Yuan Jing

Keyword reduction is a technique that removes less important keywords from the original dataset. It aims to decrease the training time of a learning machine and to improve the performance of text categorization. Some researchers have applied rough sets, a popular computational intelligence tool, to keyword reduction. However, the classical rough set model that is usually adopted can only deal with nominal values. In this work, we apply neighborhood rough sets to the keyword reduction problem. A heuristic algorithm is proposed and compared with several classical methods, such as Information Gain, Mutual Information, and the CHI-square statistic. The experimental results show that the proposed method outperforms the other methods.

Keywords: Text Categorization; Keyword Reduction; Neighborhood Rough Sets; Heuristic Algorithm
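
The abstract does not spell out the heuristic itself, so the following is only a minimal sketch of one common way neighborhood rough sets are used for feature reduction: a forward greedy search driven by the neighborhood dependency degree (the fraction of samples whose delta-neighborhood contains a single class). It should not be read as the author's algorithm. The function names, the Euclidean distance, the threshold delta = 0.15, and the toy keyword weights are all assumptions made for illustration.

```python
from math import sqrt

def neighborhood(X, i, features, delta):
    """Indices of samples within Euclidean distance delta of sample i,
    measured only on the chosen features (hypothetical helper)."""
    return [j for j in range(len(X))
            if sqrt(sum((X[i][f] - X[j][f]) ** 2 for f in features)) <= delta]

def dependency(X, y, features, delta):
    """Neighborhood dependency degree: share of samples whose
    delta-neighborhood contains only samples of the same class."""
    if not features:
        return 0.0
    pure = sum(1 for i in range(len(X))
               if all(y[j] == y[i] for j in neighborhood(X, i, features, delta)))
    return pure / len(X)

def greedy_reduct(X, y, delta=0.15, eps=1e-9):
    """Forward greedy selection: repeatedly add the keyword that raises the
    dependency degree the most, and stop when no keyword adds any gain."""
    selected, best = [], 0.0
    remaining = list(range(len(X[0])))
    while remaining:
        scored = [(dependency(X, y, selected + [f], delta), f) for f in remaining]
        # Highest score wins; ties go to the lowest feature index.
        score, f = max(scored, key=lambda t: (t[0], -t[1]))
        if score - best <= eps:
            break
        selected.append(f)
        remaining.remove(f)
        best = score
    return selected, best

# Toy data (invented): rows are documents, columns are normalized weights
# of three keywords. Keyword 0 separates the classes, keyword 1 is noise,
# keyword 2 is constant.
X = [
    [0.90, 0.5, 0.1],
    [0.80, 0.1, 0.1],
    [0.85, 0.9, 0.1],
    [0.10, 0.5, 0.1],
    [0.20, 0.1, 0.1],
    [0.15, 0.9, 0.1],
]
y = ["sports", "sports", "sports", "finance", "finance", "finance"]

reduct, score = greedy_reduct(X, y)
print(reduct, score)
```

Because the neighborhoods are defined by a distance threshold on real-valued weights, no discretization of the keyword weights is needed, which is the limitation of the classical nominal-valued model that the abstract points to. On the toy data the search keeps only keyword 0, since it alone already makes every neighborhood class-pure.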


ABOUT THE AUTHOR

Si-Yuan Jing
Sichuan Province University Key Laboratory of Internet Natural Language Intelligent Processing, Leshan Normal University

