
Optimize MapReduce Job Performance

To improve Hadoop performance, you need to change various configuration parameters in core-site.xml, hdfs-site.xml, and mapred-site.xml. Which parameters to tune, and how, depends on the type of processing; it varies from case to case and there is no hard and fast rule.

To install Hadoop on an Ubuntu cluster, you can refer to this post.

We can change the block size, the number of mappers and reducers, the sort factor, JVM reuse, and the memory for the Java processes, as well as enable compression of the map output, use a combiner, and so on.
I found a very nice description of these tuning options given by Cloudera.
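As a rough illustration only, here is a minimal sketch of how a few of these knobs might be set in mapred-site.xml. The property names are the classic MRv1-style names used by older CDH releases (newer Hadoop versions use the mapreduce.* equivalents), and the values are placeholders to adapt to your own workload, not recommendations.

<!-- mapred-site.xml: example tuning properties (placeholder values, adjust per workload) -->
<configuration>
  <!-- Number of streams merged at once when sorting map output -->
  <property>
    <name>io.sort.factor</name>
    <value>25</value>
  </property>
  <!-- Buffer size, in MB, used while sorting map output -->
  <property>
    <name>io.sort.mb</name>
    <value>250</value>
  </property>
  <!-- Compress intermediate map output to reduce shuffle I/O -->
  <property>
    <name>mapred.compress.map.output</name>
    <value>true</value>
  </property>
  <!-- Reuse task JVMs instead of launching a new JVM for every task (-1 = unlimited reuse) -->
  <property>
    <name>mapred.job.reuse.jvm.num.tasks</name>
    <value>-1</value>
  </property>
  <!-- Heap size for each map/reduce child JVM -->
  <property>
    <name>mapred.child.java.opts</name>
    <value>-Xmx1024m</value>
  </property>
</configuration>

The HDFS block size is set separately in hdfs-site.xml (dfs.block.size in older releases, dfs.blocksize in newer ones), and a combiner is specified in the job code itself, for example with job.setCombinerClass(...).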



Hue Installation and Configuration

This section describes how to install Cloudera Hue and change its default configuration, for example configuring Hue to use a different database and sending an e-mail notification when a job completes.


Installing Hue on one machine with CDH in pseudo-distributed mode:


To install Hue:
  • Hue can be installed with this single command:

$ sudo apt-get install hue
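
After installation, Hue can typically be started with sudo service hue start and opened in a browser at http://localhost:8888 (Hue's default port). The database and e-mail settings mentioned above live in Hue's configuration file, hue.ini (commonly found under /etc/hue/). The snippet below is only a hedged sketch with placeholder host names and credentials, based on the standard [desktop] section layout; adapt it to your own environment.

# hue.ini: example overrides (placeholder values)
[desktop]
  # Use MySQL instead of the default SQLite database
  [[database]]
    engine=mysql
    host=localhost
    port=3306
    user=hue
    password=secret
    name=hue

  # SMTP settings so Hue can send e-mail notifications (e.g. on job completion)
  [[smtp]]
    host=smtp.example.com
    port=25
    user=
    password=
    tls=no
    default_from_email=hue@example.com

After editing hue.ini, restart the Hue service for the changes to take effect.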