International Journal of Computer Science Issues

Parallel and Scalable Map Reduce and Pipeline Tree Classifiers for Massive Dataset Using Map Reduce and Data Flow Pipeline

A. M. James Raj, J. Prema, P. Xavier and F. Sagayaraj Francis

One of the important research areas today is the classification of Big Data. While there are many traditional classification methods, extending them to Big Data is quite challenging. The decision tree classifier is one of the effective traditional classification techniques. The combination of Hadoop and Map Reduce has been widely adopted, both commercially and academically, to process Big Data. Of late, the Google Cloud Dataflow paradigm has entered the Big Data scene, augmenting the earlier systems with stream processing. This paper presents two algorithms, one based on Map Reduce and one on Google Cloud Dataflow, for implementing decision trees for classification. The performance of both algorithms is compared across various parameters.

Keywords: Decision Tree, Hadoop Distributed File System, Map Reduce Classifier, Pipeline Tree Classifier, Google Dataflow


ABOUT THE AUTHORS

A. M. James Raj
A. M. James Raj received an M.Sc. in Computer Science from Bharathidasan University, Trichy, and an M.Phil. from Alagappa University, Karaikudi, both in Tamil Nadu, India, and an M.Tech. in Information Technology from AAI–DU, Allahabad, India. He also cleared the National Eligibility Test (NET), a qualifying examination for college and university professors conducted by the central Government of India. He is presently working as an associate professor of computer science and applications at Pope John Paul II College of Education, affiliated to Pondicherry University. His research interests include data mining, in particular web mining, and databases.

J. Prema
J. Prema received B.Tech. and M.Tech. degrees from Pondicherry University and is currently working at Tata Consultancy Services, Chennai.

P. Xavier
P. Xavier obtained his Ph.D. degree from Sri Chandrasekharendra Saraswathi Viswa Mahavidyalaya University, Kanchipuram and is currently working as a Professor of Computer Applications at Sacred Heart College, Tirupattur, Tamil Nadu, India.

F. Sagayaraj Francis
F. Sagayaraj Francis obtained his Ph.D. degree from Pondicherry University and is currently working as a Professor of Computer Science and Engineering at Pondicherry Engineering College, Pondicherry, India.
