How to Install Hadoop on Ubuntu
What is Hadoop?
Hadoop is an open-source, Java-based framework. It is used to store large amounts of data and provides several components for accessing that data. Because Hadoop is written in Java, installing Java is the most important prerequisite. Here we discuss how to install Hadoop on the Ubuntu operating system.
Hadoop has the following three main layers (a quick command-level view of each follows the list):
1. HDFS – Stores large amounts of data in a distributed file system that runs across the machines of a Hadoop cluster.
2. MapReduce – Processes large data sets in the form of key/value pairs.
3. YARN – Manages cluster resources and schedules applications.
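For orientation, once the installation below is complete, each layer can be exercised from the shell. This is only a sketch: localfile.txt is a placeholder name, and the example jar path assumes the Hadoop 2.7.3 tarball used later in this guide.
$ hdfs dfs -mkdir -p /user/hduser          # HDFS: create a directory in the distributed file system
$ hdfs dfs -put localfile.txt /user/hduser # HDFS: copy a local file into the cluster
$ yarn node -list                          # YARN: list the NodeManagers known to the ResourceManager
$ hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar pi 2 5   # MapReduce: run a sample job on YARN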
Steps to Install Java:
Step 1: Click here to download Java
Hadoop programs are written in Java, so installing Java is an essential prerequisite for Hadoop.
Step 2: Commands to install Java
Commands:
$ sudo apt-get update
$ sudo apt-get install openjdk-8-jre
$ sudo apt-get install openjdk-8-jdk
$ java -version
Java is now installed. Its installation directory is what the JAVA_HOME variable in the .bashrc file should point to.
Step 3: Find out where Java is installed
Command:
$ ls -l /etc/alternatives/javac
lrwxrwxrwx 1 root root 36 Nov 14 23:15 /etc/alternatives/javac -> /usr/lib/jvm/java-8-oracle/bin/javac
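With the OpenJDK 8 packages installed above, the symlink typically points under /usr/lib/jvm/java-8-openjdk-amd64 instead. A minimal sketch for exporting JAVA_HOME in ~/.bashrc, assuming that path (use whatever directory the command above actually reports, minus the trailing /bin/javac):
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
export PATH=$PATH:$JAVA_HOME/bin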
Step 4: Install SSH
- SSH is used for communication between the Hadoop daemons running on the cluster machines.
- To let Hadoop connect to localhost without a password, set up passwordless SSH as follows.
Commands:
$ sudo apt-get install ssh
$ ssh-keygen -t rsa -P ""
$ cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
$ ssh localhost
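If ssh localhost still prompts for a password, the usual cause is file permissions; these are standard OpenSSH requirements rather than anything Hadoop-specific:
$ chmod 700 ~/.ssh
$ chmod 600 ~/.ssh/authorized_keys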
Steps to Install Hadoop:
Step 1: Click here to download Hadoop
Step 2: Download and extract the Hadoop archive
$ wget http://www-us.apache.org/dist/hadoop/common/hadoop-2.7.3/hadoop-2.7.3.tar.gz
$ tar xvzf hadoop-2.7.3.tar.gz
$ sudo mkdir -p /usr/local/hadoop
$ cd hadoop-2.7.3/
$ sudo mv * /usr/local/hadoop
$ sudo chown -R hduser:hadoop /usr/local/hadoop
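The chown command above assumes a dedicated hduser account in a hadoop group, which this guide uses throughout. If they do not exist yet, a minimal sketch for creating them:
$ sudo addgroup hadoop
$ sudo adduser --ingroup hadoop hduser
$ sudo adduser hduser sudo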
The following six files must be configured to complete the Hadoop installation:
1. .bashrc file
2. hadoop-env.sh file
3. core-site.xml file
4. mapred-site.xml file
5. hdfs-site.xml file
6. yarn-site.xml file
Step 3: Configure the .bashrc file
Add the following lines to the end of ~/.bashrc:
export HADOOP_HOME=/usr/local/hadoop
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"
Command to reload the updated .bashrc file:
$ source ~/.bashrc
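To confirm that the new PATH entries are active (and assuming JAVA_HOME is already set as sketched earlier), the hadoop command should now work from any directory:
$ hadoop version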
Step 4: Configure the hadoop-env.sh file (used to set the path to Java)
Command:
$ vim /usr/local/hadoop/etc/hadoop/hadoop-env.sh
Set JAVA_HOME to the Java installation directory found in Step 3 of the Java installation:
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
Step 5: Modify core-site.xml file
Command:
$ vim /usr/local/hadoop/etc/hadoop/core-site.xml
Configure the core-site.xml file:
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
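fs.defaultFS is the URI that clients use to reach HDFS. Once the cluster is running, the two commands below are equivalent; this is only shown to illustrate what the setting does:
$ hdfs dfs -ls /
$ hdfs dfs -ls hdfs://localhost:9000/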
Step 6: Create the directories used by hdfs-site.xml
The hdfs-site.xml file needs a storage directory for each of the two HDFS node types:
1. NameNode
2. DataNode
These directories can be created with the following commands:
$ sudo mkdir -p /usr/local/hadoop_store/hdfs/namenode
$ sudo mkdir -p /usr/local/hadoop_store/hdfs/datanode
$ sudo chown -R hduser:hadoop /usr/local/hadoop_store
Step 7: Modify the hdfs-site.xml file
Command:
$ vim /usr/local/hadoop/etc/hadoop/hdfs-site.xml
Configure the hdfs-site.xml file
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
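Note that the NameNode and DataNode directories created in Step 6 are not referenced by this minimal configuration. One common way to wire them in, assuming the /usr/local/hadoop_store paths created above, is to add two more properties inside the same <configuration> block:
<property>
<name>dfs.namenode.name.dir</name>
<value>file:///usr/local/hadoop_store/hdfs/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:///usr/local/hadoop_store/hdfs/datanode</value>
</property>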
Step 8: Modify mapred-site.xml file
Commands:
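Hadoop 2.7.3 ships this file only as a template, so if mapred-site.xml does not exist yet, copy the template first (the path assumes the install location used above):
$ cp /usr/local/hadoop/etc/hadoop/mapred-site.xml.template /usr/local/hadoop/etc/hadoop/mapred-site.xml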
$ vim /usr/local/hadoop/etc/hadoop/mapred-site.xml
Configure the mapred-site.xml file
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
Step 9: Modify yarn-site.xml file
Command:
$ vim /usr/local/hadoop/etc/hadoop/yarn-site.xml
Configure the yarn-site.xml file
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
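Before starting the daemons for the first time, the NameNode must be formatted. This is a standard first-run step; run it only once, because it erases any existing HDFS metadata:
$ hdfs namenode -format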
Step 10: Start Hadoop
Command:
$ start-all.sh
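start-all.sh still works in Hadoop 2.x but is marked deprecated; a common alternative is to start the layers separately and then confirm the daemons with jps:
$ start-dfs.sh
$ start-yarn.sh
$ jps
If everything came up, jps should list NameNode, DataNode, SecondaryNameNode, ResourceManager and NodeManager. The NameNode web UI is normally reachable at http://localhost:50070 and the ResourceManager UI at http://localhost:8088.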
Step 11: Stop Hadoop
Command:
$ stop-all.sh