
Features Of Hadoop Software Library


TABLE 1. FEATURES OF HADOOP FRAMEWORK

Feature           Description
Scalability       Allows hardware infrastructure to scale up and down with no need to change data formats
Cost Efficiency   Massively parallel computation leads to a sizeable decrease in cost
Flexibility       Hadoop is schema-free, so it handles many of the challenges of big data
Fault Tolerance   Recovery from data and computation failures

B. Hadoop Software Library

The Hadoop computing library consists of several modules, including HDFS, Hive, HBase, Pig, and MapReduce.

Fig 2: Architecture of Hadoop Software Library

The different modules in the architecture of Hadoop are introduced below.

Apache Flume and Sqoop are the two data integration tools that handle data acquisition. Their main job is to collect data efficiently from different sources and store it in a centralized store.

HDFS (Hadoop Distributed File System) runs on commodity hardware and is modeled on the Google File System (GFS). HDFS consists of one NameNode, which manages the file system metadata, and many DataNodes, which store the actual data.

HBase is a column-oriented store that provides capabilities similar to Google Bigtable. HBase can serve as both the input to and the output of Hadoop MapReduce jobs.

MapReduce is
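The division of labor in HDFS described above, with one NameNode holding metadata and many DataNodes holding the data itself, can be illustrated with a small in-memory sketch. This is a toy model and not the Hadoop API: the class names, the 4-byte block size, the replication factor of 2, and the round-robin placement are all assumptions chosen for readability, and real HDFS placement and recovery are considerably more involved.

```python
from dataclasses import dataclass, field


@dataclass
class DataNode:
    """Toy stand-in for a DataNode: stores the actual block contents."""
    name: str
    blocks: dict = field(default_factory=dict)  # block_id -> bytes


@dataclass
class NameNode:
    """Toy stand-in for the NameNode: keeps only metadata about files."""
    datanodes: list
    block_size: int = 4   # hypothetical, tiny so the example stays readable
    replication: int = 2  # hypothetical replication factor
    files: dict = field(default_factory=dict)  # path -> [(block_id, [node names])]

    def write(self, path, data):
        """Split data into blocks and place each block on several DataNodes."""
        entries = []
        for offset in range(0, len(data), self.block_size):
            index = offset // self.block_size
            block_id = f"{path}#{index}"
            # Round-robin placement (an assumption made for this sketch).
            targets = [
                self.datanodes[(index + r) % len(self.datanodes)]
                for r in range(self.replication)
            ]
            for node in targets:
                node.blocks[block_id] = data[offset:offset + self.block_size]
            entries.append((block_id, [node.name for node in targets]))
        self.files[path] = entries

    def read(self, path, failed=()):
        """Reassemble a file, skipping DataNodes listed as failed."""
        by_name = {node.name: node for node in self.datanodes}
        out = b""
        for block_id, holders in self.files[path]:
            live = [h for h in holders if h not in failed]
            if not live:
                raise IOError(f"no live replica for {block_id}")
            out += by_name[live[0]].blocks[block_id]
        return out


nodes = [DataNode("dn1"), DataNode("dn2"), DataNode("dn3")]
namenode = NameNode(nodes)
namenode.write("/logs/a.txt", b"hello hadoop")
print(namenode.read("/logs/a.txt", failed={"dn1"}))
```

Because every block is copied to two nodes in this sketch, the file can still be read when one node is marked as failed, which is the kind of data recovery the fault tolerance row of Table 1 refers to.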
