Step-by-Step Tutorial to Deploy a Hadoop Cluster (Fully Distributed Mode):
Setting up Hadoop as a distributed cluster requires multiple machines (nodes): one
node acts as the master and all the others act as slaves.
If you want a quick introduction to Hadoop, please click here.
If you want to set up Hadoop in pseudo-distributed mode, please click here.
In this tutorial:
- I am using 3 nodes: 1 master and 2 slaves
- I am using Cloudera's Distribution for Apache Hadoop, CDH3U3 (you can also use Apache Hadoop 0.20.X)
- I am deploying Hadoop on Ubuntu (you can use another OS, such as CentOS or Red Hat)
Install / Set Up Hadoop on the Cluster
Install Hadoop on the master:
1. Add entries for the master and slaves to the hosts file:
Edit the hosts file and add the following entries:
$ sudo pico /etc/hosts
MASTER-IP master
SLAVE01-IP slave01
SLAVE02-IP slave02
(Replace MASTER-IP, SLAVE01-IP and SLAVE02-IP with the actual IP address of the corresponding node.)
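As an illustration of the step above, here is a minimal sketch that builds the three entries from shell variables. The IP addresses (192.168.1.10 to 192.168.1.12) and the helper name hosts_entries are made up for this example and are not part of the original setup; substitute the real addresses of your nodes. The script only prints the entries, so it can be run safely to preview them.

```shell
#!/bin/sh
# Hypothetical example addresses; replace with the real IPs of your nodes.
MASTER_IP="192.168.1.10"
SLAVE01_IP="192.168.1.11"
SLAVE02_IP="192.168.1.12"

# hosts_entries is an illustrative helper (not a standard command).
# It prints one "IP<TAB>hostname" line per node, as /etc/hosts expects.
hosts_entries() {
    printf '%s\t%s\n' "$MASTER_IP" master
    printf '%s\t%s\n' "$SLAVE01_IP" slave01
    printf '%s\t%s\n' "$SLAVE02_IP" slave02
}

# Preview the entries; nothing is written to /etc/hosts here.
hosts_entries
```

Once the preview looks right, the output could be appended with "hosts_entries | sudo tee -a /etc/hosts", and name resolution checked with something like "ping -c 1 slave01". Editing the file by hand in pico, as shown above, gives the same result.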