Friday, March 4, 2011

Hadoop in Distributed Mode

This section explains how to install Hadoop on Ubuntu. It is a quick-start tutorial: a short list of all the commands, with descriptions, required to install Hadoop in distributed mode (a multi-node cluster).

Prerequisite: Before starting Hadoop in distributed mode, you must set up Hadoop in pseudo-distributed mode, and you need at least two machines, one for the master and one for the slave (you can create more than one virtual machine on a single physical machine).




The following steps were tested on:
OS: Ubuntu
Hadoop: Apache Hadoop 0.20.X



Deploy Hadoop in Distributed Mode:

Each step below lists the command, then its description, then the machines on which to run it.
$ bin/stop-all.sh
Before starting Hadoop in distributed mode, stop the pseudo-distributed cluster running on each machine.
Run this command on all machines in the cluster (master and slave).

$ sudo vi /etc/hosts
Then add one line per machine, mapping its IP address to its host name:
IP-address master (e.g. 192.168.0.1 master)
IP-address slave (e.g. 192.168.0.2 slave)
Run this command on all machines in the cluster (master and slave).

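The hosts entries can also be added from a script. The following is a minimal sketch, not part of the original steps: the helper name add_host and the HOSTS_FILE variable are assumptions made for illustration, and HOSTS_FILE defaults to a scratch file so that nothing in /etc is touched while trying it out.

```shell
# Sketch: add name-to-address mappings to a hosts file without duplicates.
# HOSTS_FILE is an assumed variable; it defaults to a scratch file.
# Pointing it at /etc/hosts requires root and is left to the reader.
HOSTS_FILE="${HOSTS_FILE:-$(mktemp)}"

# add_host IP NAME: append "IP NAME" unless NAME already has an entry.
add_host() {
  if grep -qE "[[:space:]]$2([[:space:]]|\$)" "$HOSTS_FILE"; then
    echo "entry for $2 already present"
  else
    printf '%s %s\n' "$1" "$2" >> "$HOSTS_FILE"
  fi
}

add_host 192.168.0.1 master   # example addresses from the text
add_host 192.168.0.2 slave
add_host 192.168.0.2 slave    # repeated call changes nothing
cat "$HOSTS_FILE"
```

Running it twice against the same file leaves a single line per host, which makes it safe to repeat on every machine.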
$ ssh-copy-id -i $HOME/.ssh/id_rsa.pub slave
This sets up passwordless SSH from the master to the slave. You must log in with the same user name on all machines.

Passwordless SSH can also be set up manually:
$ cat .ssh/id_rsa.pub
Copy the output into the .ssh/authorized_keys file on the slave (the system you wish to SSH into without being prompted for a password).

Run this command on the master.
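The manual method can be sketched as a small script to run on the slave. This is an illustration under stated assumptions: the helper authorize_key, the SSH_DIR variable, and the key line are made up for the example, and SSH_DIR defaults to a scratch directory instead of the real ~/.ssh.

```shell
# Sketch of the manual method: add a public key to authorized_keys once,
# with restrictive permissions on the directory and the file.
# SSH_DIR is an assumed variable; it defaults to a scratch directory.
SSH_DIR="${SSH_DIR:-$(mktemp -d)}"
AUTH_KEYS="$SSH_DIR/authorized_keys"

# authorize_key KEYLINE: append the key line unless it is already listed.
authorize_key() {
  mkdir -p "$SSH_DIR" && chmod 700 "$SSH_DIR"
  touch "$AUTH_KEYS" && chmod 600 "$AUTH_KEYS"
  grep -qxF "$1" "$AUTH_KEYS" || printf '%s\n' "$1" >> "$AUTH_KEYS"
}

# Made-up key line for illustration; on a real slave this would be the
# content of the master's .ssh/id_rsa.pub.
authorize_key "ssh-rsa AAAAexamplekey hadoop@master"
authorize_key "ssh-rsa AAAAexamplekey hadoop@master"   # no duplicate added
cat "$AUTH_KEYS"
```

After the real key is in place, running ssh slave on the master should log in without asking for a password.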

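$ vi conf/masters
Then type: master
The conf/masters file defines the host on which Hadoop starts the secondary namenode of our multi-node cluster. The namenode itself runs on the machine where bin/start-dfs.sh is executed.
Run this command on the master.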

$ vi conf/slaves
Then type: slave
The conf/slaves file lists the hosts, one per line, where the Hadoop slave daemons (datanodes and tasktrackers) will run.
Run this command on all machines in the cluster (master and slave).
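Both files are plain lists with one host name per line, so they can be written without an editor. A minimal sketch, assuming a helper called write_host_list and a CONF_DIR variable (both invented for this example; CONF_DIR defaults to a scratch directory, not Hadoop's real conf directory):

```shell
# Sketch: write conf/masters and conf/slaves, one host name per line.
# CONF_DIR is an assumed variable; it defaults to a scratch directory.
CONF_DIR="${CONF_DIR:-$(mktemp -d)}"

# write_host_list FILE HOST...: overwrite FILE with the given host names.
write_host_list() {
  file="$1"
  shift
  printf '%s\n' "$@" > "$file"
}

write_host_list "$CONF_DIR/masters" master
write_host_list "$CONF_DIR/slaves" slave
cat "$CONF_DIR/masters" "$CONF_DIR/slaves"
```

A cluster with more slaves would pass all of their host names to the second call.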

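$ vi conf/core-site.xml
Then add the following property inside the <configuration> element:
<property>
  <name>fs.default.name</name>
  <value>hdfs://master:54310</value>
</property>

This edits the configuration file core-site.xml so that the default file system is the HDFS namenode on the master.

Run this command on all machines in the cluster (master and slave).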
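For reference, a complete core-site.xml with the wrapper element might look like the file written by the sketch below. The CONF_DIR variable is an assumption made for the example and defaults to a scratch directory; the property itself is the one given in the step above.

```shell
# Sketch: write a complete core-site.xml, wrapper element included.
# CONF_DIR is an assumed variable; it defaults to a scratch directory.
CONF_DIR="${CONF_DIR:-$(mktemp -d)}"

cat > "$CONF_DIR/core-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://master:54310</value>
  </property>
</configuration>
EOF

cat "$CONF_DIR/core-site.xml"
```

The other site files follow the same layout, with their own properties placed inside the same wrapper element.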

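$ vi conf/mapred-site.xml
Then add the following property inside the <configuration> element:
<property>
  <name>mapred.job.tracker</name>
  <value>master:54311</value>
</property>

This edits the configuration file mapred-site.xml so that every node knows the host and port of the jobtracker on the master.

Run this command on all machines in the cluster (master and slave).
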
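$ vi conf/hdfs-site.xml
Then add the following property inside the <configuration> element:
<property>
  <name>dfs.replication</name>
  <value>2</value>
</property>

This edits the configuration file hdfs-site.xml. dfs.replication sets how many copies of each block HDFS keeps. The value 2 assumes at least two datanodes; with fewer, blocks are reported as under-replicated.

Run this command on all machines in the cluster (master and slave).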

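$ vi conf/mapred-site.xml
Then add the following properties inside the <configuration> element:
<property>
  <name>mapred.local.dir</name>
  <value>${hadoop.tmp.dir}/mapred/local</value>
</property>

<property>
  <name>mapred.map.tasks</name>
  <value>20</value>
</property>

<property>
  <name>mapred.reduce.tasks</name>
  <value>2</value>
</property>

This edits the configuration file mapred-site.xml again: mapred.local.dir is the local directory where MapReduce stores intermediate data, mapred.map.tasks is the default number of map tasks per job, and mapred.reduce.tasks is the default number of reduce tasks per job.

Run this command on the master.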
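Because several properties now live in the same file, it can help to read a value back and confirm what is actually set. The sketch below is one way to do that with awk; the helper get_property and the scratch file are assumptions for illustration, and the parser only handles the simple layout used in this tutorial (name and value each on a single line).

```shell
# Sketch: print the value of one property from a Hadoop site file.
# get_property FILE NAME (hypothetical helper, simple line-based parsing).
get_property() {
  awk -v want="$2" '
    /<name>/  { n = $0; sub(/.*<name>/, "", n); sub(/<\/name>.*/, "", n) }
    /<value>/ { v = $0; sub(/.*<value>/, "", v); sub(/<\/value>.*/, "", v)
                if (n == want) print v }
  ' "$1"
}

# Scratch copy of the settings from the step above (quoted heredoc so the
# shell does not try to expand ${hadoop.tmp.dir}).
SITE="$(mktemp)"
cat > "$SITE" <<'EOF'
<configuration>
<property>
  <name>mapred.local.dir</name>
  <value>${hadoop.tmp.dir}/mapred/local</value>
</property>
<property>
  <name>mapred.map.tasks</name>
  <value>20</value>
</property>
<property>
  <name>mapred.reduce.tasks</name>
  <value>2</value>
</property>
</configuration>
EOF

get_property "$SITE" mapred.map.tasks
get_property "$SITE" mapred.reduce.tasks
```

Against a real installation the first argument would be conf/mapred-site.xml, conf/core-site.xml or conf/hdfs-site.xml.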

$ bin/start-dfs.sh
This starts the multi-node cluster. The HDFS daemons are started first: the namenode daemon is started on the master, and datanode daemons are started on all slaves.
Run this command on the master.

$ jps
It should give output like this:
14799 NameNode
15314 Jps
14977 SecondaryNameNode

Run this command on the master.

$ jps
It should give output like this:
15183 DataNode
15616 Jps

Run this command on all slaves.

$ bin/start-mapred.sh
Next, the MapReduce daemons are started: the jobtracker is started on the master, and tasktracker daemons are started on all slaves.
Run this command on the master.

$ jps
It should give output like this:
16017 Jps
14799 NameNode
15596 JobTracker
14977 SecondaryNameNode

Run this command on the master.

$ jps
It should give output like this:
15183 DataNode
15897 TaskTracker
16284 Jps

Run this command on all slaves.

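Comparing jps listings by eye gets tedious on larger clusters. As a rough sketch, the check can be scripted; the function check_daemons is an invented helper that reads jps output on standard input, and the sample listing fed to it here is the master output shown above.

```shell
# Sketch: verify that the expected daemons appear in jps output.
# check_daemons NAME...: reads jps output on stdin (hypothetical helper).
check_daemons() {
  out=$(cat)
  status=0
  for name in "$@"; do
    # The second column of each jps line is the process name.
    if printf '%s\n' "$out" |
       awk -v n="$name" '$2 == n { found = 1 } END { exit !found }'; then
      echo "ok: $name"
    else
      echo "missing: $name"
      status=1
    fi
  done
  return $status
}

# Sample listing from the master; on a live node use: jps | check_daemons ...
printf '%s\n' '16017 Jps' '14799 NameNode' '15596 JobTracker' \
  '14977 SecondaryNameNode' |
  check_daemons NameNode SecondaryNameNode JobTracker
```

On a slave the expected names would be DataNode and TaskTracker. The function returns a non-zero status when any daemon is missing.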
Congratulations, the Hadoop setup is complete.
http://localhost:50070/ is the web-based interface for the namenode.
http://localhost:50030/ is the web-based interface for the jobtracker.
Both daemons run on the master, so open these URLs in a browser on the master, or replace localhost with master when browsing from another machine.

Now let's run some examples.

Run the pi example:
$ bin/hadoop jar hadoop-*-examples.jar pi 10 100

Run the grep example (conf/* copies the configuration files themselves into the input directory):
$ bin/hadoop dfs -mkdir input
$ bin/hadoop dfs -put conf/* input
$ bin/hadoop jar hadoop-*-examples.jar grep input output 'dfs[a-z.]+'
$ bin/hadoop dfs -cat output/*

Run the wordcount example:
$ bin/hadoop dfs -mkdir inputwords
$ bin/hadoop dfs -put conf/* inputwords
$ bin/hadoop jar hadoop-*-examples.jar wordcount inputwords outputwords
$ bin/hadoop dfs -cat outputwords/*

To stop the daemons:
$ bin/stop-mapred.sh
$ bin/stop-dfs.sh

Run these commands on the master.
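A MapReduce job refuses to start if its output directory already exists in HDFS, so rerunning an example means removing the old output first. The sketch below wraps that in a helper; rerun_example and the HADOOP variable are assumptions for illustration, and HADOOP defaults to a dry run that only prints the commands.

```shell
# Sketch: remove a stale HDFS output directory, then rerun an example job.
# HADOOP is an assumed variable. It defaults to "echo bin/hadoop", a dry
# run that only prints the commands; set HADOOP=bin/hadoop to run them.
HADOOP="${HADOOP:-echo bin/hadoop}"

# rerun_example JOB INPUT OUTPUT [ARGS...] (hypothetical helper)
rerun_example() {
  job="$1"; input="$2"; output="$3"
  shift 3
  # Ignore the error that -rmr reports when the directory does not exist.
  $HADOOP dfs -rmr "$output" || true
  $HADOOP jar hadoop-*-examples.jar "$job" "$input" "$output" "$@"
}

rerun_example wordcount inputwords outputwords
rerun_example grep input output 'dfs[a-z.]+'
```

In dry-run mode the script prints the two commands for each job, which is a convenient way to check them before running them against the cluster.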
