International Journal of Computer Science Issues

A Study On Document Classification Using Machine Learning Techniques

Kabita Thaoroijam

With the explosion of information fuelled by the growth of the World Wide Web, it is no longer feasible for a human observer to understand all the incoming data, or even to classify it into categories. As the volume of information grows alongside the available computing power, automatic classification of data, particularly textual data, becomes increasingly important. Text classification is the task of automatically sorting a set of documents into categories from a predefined set, and it is one of the important research issues in the field of text mining. This paper reviews the generic text classification process, the phases of that process, and the methods used at each phase.

Keywords: machine learning algorithm, document representation, classification, performance evaluation

ABOUT THE AUTHOR

Kabita Thaoroijam
Kabita Thaoroijam is an Assistant Professor in the Department of Computer Applications, Haldia Institute of Technology, West Bengal, India. Her research interests are in Text Mining and Natural Language Processing. She has also worked on various NLP projects.
