Verification and Validation of MapReduce Program Model for Parallel Support Vector Machine Algorithm on Hadoop Cluster
We currently live in the data age. It is not easy to measure the total volume of structured and unstructured data that requires machine-based systems and technologies in order to be fully analyzed. Efficient implementation techniques are the key to meeting the scalability and performance requirements of such scientific data analysis. To that end, this paper analyzes the sequential Support Vector Machine in WEKA and several MapReduce programs, including a parallel Support Vector Machine, on a Hadoop cluster; in this way the algorithms are verified and validated on the Hadoop cluster using the MapReduce model. The performance of these applications is reported in terms of execution time (training time) against the number of nodes. Experimental results show that execution time decreases as the number of nodes increases. This paper is a research study of the above MapReduce applications.
Keywords: Machine Learning, SVM, LIBSVM, WEKA Tool, MultiFileWordCount, PiEstimator, Parallel SVM, Hadoop, MapReduce
ABOUT THE AUTHORS
Kiran M.
M.Tech - Department of Computer Science and Engineering, Christ University Faculty of Engineering
Amresh Kumar
M.Tech - Department of Computer Science and Engineering, Christ University Faculty of Engineering
Saikat Mukherjee
Senior Software Engineer, Hewlett-Packard (HP)
Ravi Prakash G.
Professor, Department of Computer Science and Engineering, Christ University Faculty of Engineering