A Study On Document Classification Using Machine Learning Techniques
With the explosion of information fuelled by the growth of the World Wide Web, it is no longer feasible for a human observer to understand all the incoming data, or even to classify it into categories. As the volume of information and the available computing power grow together, automatic classification of data, particularly textual data, becomes increasingly important. Text classification is the task of automatically sorting a set of documents into categories from a predefined set, and it is one of the important research issues in the field of text mining. This paper reviews the generic text classification process, the phases of that process, and the methods used at each phase.
Keywords: Machine learning algorithm, document representation, classification, performance evaluation
ABOUT THE AUTHOR
Kabita Thaoroijam
Kabita Thaoroijam is an Assistant Professor in the Department of Computer Applications, Haldia Institute of Technology, in West Bengal, India. Her research interests are in Text Mining and Natural Language Processing. She has also worked on various NLP projects.