Monday, April 13, 2015

Install Cloudera Hadoop CDH5 with YARN on Ubuntu


This tutorial describes how to install and configure a single-node Hadoop cluster on Ubuntu. A single-node Hadoop cluster is also known as Hadoop pseudo-distributed mode. The tutorial is short and to the point, so you can install Hadoop in about 10 minutes. Once the installation is done, you can run Hadoop Distributed File System (HDFS) and Hadoop MapReduce operations.

Recommended Platform:

  • OS: Linux is supported as a development and production platform. You can use Ubuntu 14.04 or later (other Linux distributions such as CentOS and Red Hat also work).
  • Hadoop: Cloudera Distribution for Apache Hadoop CDH5.x (you can also use Apache Hadoop 2.x).


Install Java 7 (Oracle Java recommended)

Install Python Software Properties

$sudo apt-get install python-software-properties

Add Repository

$sudo add-apt-repository ppa:webupd8team/java

Update the source list

$sudo apt-get update

Install Java

$sudo apt-get install oracle-java7-installer
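After the installer finishes, it can help to confirm that a Java runtime is actually visible on the PATH before moving on. The following is a minimal sketch; `check_java` is a hypothetical helper name, not part of the installer.

```shell
# Report whether a Java runtime is visible on the PATH.
# check_java is a hypothetical helper introduced for this sketch.
check_java() {
    if command -v java >/dev/null 2>&1; then
        # java prints its version banner on stderr, so redirect it to stdout.
        java -version 2>&1 | head -n 1
        return 0
    fi
    echo "java not found on PATH" >&2
    return 1
}

check_java || echo "Install Java before continuing."
```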

Configure SSH

Install the OpenSSH server and client

$sudo apt-get install openssh-server openssh-client

Generate Key Pairs

$ssh-keygen -t rsa -P ""

Configure password-less SSH

$cat $HOME/.ssh/ >> $HOME/.ssh/authorized_keys
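The step above appends the generated public key to `authorized_keys`. The sketch below does the same thing a little more defensively. It assumes the key pair was written to the `ssh-keygen` default location, `~/.ssh/id_rsa.pub` (the command above does not name the file), and `authorize_key` is a hypothetical helper name used only for illustration.

```shell
# Append a public key to authorized_keys and tighten permissions.
# authorize_key is a hypothetical helper for this sketch.
#   $1 - public key file (assumed default: ~/.ssh/id_rsa.pub)
#   $2 - SSH directory   (default: ~/.ssh)
authorize_key() {
    pubkey="${1:-$HOME/.ssh/id_rsa.pub}"
    ssh_dir="${2:-$HOME/.ssh}"
    auth_file="$ssh_dir/authorized_keys"

    [ -f "$pubkey" ] || { echo "public key not found: $pubkey" >&2; return 1; }

    mkdir -p "$ssh_dir"
    touch "$auth_file"
    # Append only if this exact key line is not already present.
    if ! grep -qxF -f "$pubkey" "$auth_file"; then
        cat "$pubkey" >> "$auth_file"
    fi
    # sshd may reject keys when these are writable by other users (StrictModes).
    chmod 700 "$ssh_dir"
    chmod 600 "$auth_file"
}
```

Calling `authorize_key` with no arguments uses the assumed defaults. Running it twice does not duplicate the key.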

Verify by connecting to localhost over SSH

$ssh localhost

Install Hadoop

Download Hadoop

Extract the tarball

$tar xzf hadoop-2.5.0-cdh5.3.2.tar.gz
Note: All the required jars, scripts, configuration files, etc. are available in the HADOOP_HOME directory (hadoop-2.5.0-cdh5.3.2).
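Since the note refers to HADOOP_HOME, one way to make that name usable in the shell is to export it and add the Hadoop script directories to the PATH. This is a sketch that assumes the tarball was extracted in your home directory; adjust the path if you extracted it elsewhere.

```shell
# Assumption: the tarball was extracted in the home directory.
export HADOOP_HOME="$HOME/hadoop-2.5.0-cdh5.3.2"
# bin holds the hadoop/hdfs/yarn commands; sbin holds the start/stop scripts.
export PATH="$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin"
```

To keep these settings across sessions, the same two lines can be appended to `~/.bashrc`.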

