Digging into Hadoop-based Big Data Architectures
Over the last decade, the notion of Big Data has pervaded the field of information technology. This reflects a common reality: organizations have to deal with huge volumes of information that need to be processed, which represents a significant commercial and marketing challenge. The collection and analysis of Big Data have given rise to solutions that combine traditional data warehouse technologies with Big Data systems in a logical and coherent structure. Accordingly, many vendors offer their own Hadoop distributions, such as Hortonworks, Cloudera, MapR, IBM InfoSphere BigInsights, Pivotal HD, and Microsoft HDInsight. Their main purpose is to supply companies with a complete, stable, and secure Hadoop solution for Big Data. They compete with each other to find efficient and complete solutions that satisfy their customers' needs and, hence, to profit from this fast-growing market. In this article, we present a comparative study that uses 34 relevant criteria to determine the advantages and drawbacks of the most prominent Hadoop distribution providers.
Keywords: Big Data, Big Data distributions, Hadoop Architectures, comparison, Big Data solutions comparison.
ABOUT THE AUTHORS
Allae Erraissi
Ph.D. student in computer science at the Faculty of Sciences of Hassan II University, Casablanca, Morocco. He received his Master's degree in Information Sciences and Engineering from the same university in 2016 and currently works as a mathematics teacher at a high school in Casablanca, Morocco. His main interests are new technologies, namely Model-driven engineering, Cloud Computing, and Big Data.
Abdessamad Belangour
Associate Professor at the Faculty of Sciences of Hassan II University, Casablanca, Morocco. He works mainly on Model-driven engineering approaches and their application to emerging technologies such as Big Data, Business Intelligence, Cloud Computing, the Internet of Things, and real-time embedded systems.
Abderrahim Tragha
Full Professor at the Faculty of Sciences of Hassan II University, Casablanca, Morocco. He specializes in cryptography and has recently become interested in automatic language processing applied to the Arabic language, in Model-driven engineering, and in Big Data.