Apache Hadoop Oozie Tutorial
Introduction:
Oozie is a workflow scheduler that manages Hadoop jobs and combines multiple jobs in a particular order to accomplish a larger task. It is an open source framework. Oozie supports MapReduce, Hive, and HDFS jobs, among others. An Oozie workflow is a Directed Acyclic Graph (DAG) built from two kinds of nodes: action nodes and control flow nodes.
An advantage of Oozie is that it integrates with the Hadoop stack and supports MapReduce and HDFS jobs. Oozie has the following three types of jobs:
1. Workflow Jobs – Represent a sequence of jobs to be executed.
2. Coordinator Jobs – Contain workflow jobs and are triggered by time.
3. Bundle Jobs – Package multiple coordinator jobs (and, through them, their workflow jobs) so they can be managed together.
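To make the relationship between a coordinator job and a workflow job more concrete, here is a minimal sketch that writes a coordinator definition which would trigger one workflow per day. The application name, the start and end dates, and the HDFS application path are hypothetical placeholders rather than values from this tutorial, and `nameNode` is assumed to be a property supplied at submission time.

```shell
# Sketch: write a minimal coordinator.xml that runs one workflow per day.
# The app name, start/end dates, and app-path are hypothetical placeholders.
write_coordinator() {
    # $1 = directory that will receive coordinator.xml
    mkdir -p "$1" || return 1
    # The quoted 'EOF' keeps ${...} literal so Oozie, not the shell, resolves it.
    cat > "$1/coordinator.xml" <<'EOF'
<coordinator-app name="daily-demo" frequency="${coord:days(1)}"
                 start="2014-01-01T00:00Z" end="2014-12-31T00:00Z"
                 timezone="UTC" xmlns="uri:oozie:coordinator:0.2">
  <action>
    <workflow>
      <app-path>${nameNode}/user/hduser/apps/demo-wf</app-path>
    </workflow>
  </action>
</coordinator-app>
EOF
}

# Write into COORD_DIR if set, otherwise into a fresh temporary directory.
COORD_DIR="${COORD_DIR:-$(mktemp -d)}"
write_coordinator "$COORD_DIR"
echo "wrote $COORD_DIR/coordinator.xml"
```

The time trigger lives in the `frequency`, `start`, and `end` attributes, while the `app-path` element points at the workflow application the coordinator runs.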
Types of Nodes in Apache Oozie:
Action Node – Runs a job in the workflow, for example a MapReduce, Hive, or Java program.
Control Flow Node – Controls the path of execution between actions.
Start Node – Marks where job execution begins.
End Node – Marks where job execution ends successfully.
Error Node – If an error occurs during execution, the error node (called a kill node in Oozie) stops the job and reports the error message.
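The node types above map onto elements of a workflow definition file. The following is a minimal sketch, not a workflow taken from this tutorial: it writes a workflow.xml with one filesystem action plus start, kill, and end nodes. The node names and the HDFS output path are hypothetical, and `nameNode` is assumed to be a property supplied when the job is submitted.

```shell
# Sketch: write a minimal workflow.xml with one action and the control nodes
# described above. Node names and the HDFS path are hypothetical placeholders.
write_workflow() {
    # $1 = directory that will receive workflow.xml
    mkdir -p "$1" || return 1
    # The quoted 'EOF' keeps ${...} literal so Oozie, not the shell, resolves it.
    cat > "$1/workflow.xml" <<'EOF'
<workflow-app name="demo-wf" xmlns="uri:oozie:workflow:0.2">
  <!-- Start node: entry point of the workflow -->
  <start to="make-dir"/>
  <!-- Action node: a filesystem action that creates an HDFS directory -->
  <action name="make-dir">
    <fs>
      <mkdir path="${nameNode}/user/hduser/demo-output"/>
    </fs>
    <ok to="end"/>
    <error to="fail"/>
  </action>
  <!-- Error (kill) node: stops the job and records a message -->
  <kill name="fail">
    <message>Action failed: ${wf:errorMessage(wf:lastErrorNode())}</message>
  </kill>
  <!-- End node: successful completion -->
  <end name="end"/>
</workflow-app>
EOF
}

# Write into APP_DIR if set, otherwise into a fresh temporary directory.
APP_DIR="${APP_DIR:-$(mktemp -d)}"
write_workflow "$APP_DIR"
echo "wrote $APP_DIR/workflow.xml"
```

Each action names its successor twice: `ok` is followed on success and `error` on failure, which is how the graph reaches either the end node or the kill node.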
Hadoop location – /home/hduser/hadoop
Step 1: Check the Home Directory
$ pwd
/home/hduser
Step 2: Download Oozie
$ wget http://supergsego.com/apache/oozie/3.3.2/oozie-3.3.2.tar.gz
Step 3: Untar
$ tar xvzf oozie-3.3.2.tar.gz
Step 4: Build Oozie
$ cd oozie-3.3.2/bin
$ ./mkdistro.sh -DskipTests
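The build in this step relies on a JDK and Maven being installed. If mkdistro.sh fails early, a quick check along these lines can show which tools are missing from the PATH. This is a sketch only; `missing_tools` is a hypothetical helper, not part of Oozie.

```shell
# Sketch: report which build prerequisites are missing from PATH.
# missing_tools is a hypothetical helper, not an Oozie command.
missing_tools() {
    for tool in "$@"; do
        # Print the tool name only when the shell cannot find it.
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

missing=$(missing_tools java mvn tar)
if [ -n "$missing" ]; then
    # Left unquoted on purpose so several names print on one line.
    echo "Install before building:" $missing
else
    echo "All build tools found"
fi
```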
Step 5: Oozie Server Setup
1. Copy the built binaries
$ cd ../../
$ cp -R oozie-3.3.2/distro/target/oozie-3.3.2-distro/oozie-3.3.2/ oozie
2. Create Libext Directory
$ cd oozie
$ mkdir libext
3. Copy the Hadoop JARs into libext
$ cp ../oozie-3.3.2/hadooplibs/target/oozie-3.3.2-hadooplibs.tar.gz .
$ tar xzvf oozie-3.3.2-hadooplibs.tar.gz
$ cp oozie-3.3.2/hadooplibs/hadooplib-1.1.1.oozie-3.3.2/* libext/
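4. Update the Hadoop Configuration
Add the following properties to Hadoop's core-site.xml so that the hduser account can act as a proxy user:
<property>
  <name>hadoop.proxyuser.hduser.hosts</name>
  <value>localhost</value>
</property>
<property>
  <name>hadoop.proxyuser.hduser.groups</name>
  <value>hadoop</value>
</property>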
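After editing the configuration, it can be worth confirming that both proxy user properties are actually present. The sketch below assumes the properties live in core-site.xml under /home/hduser/hadoop/conf, which is a guess based on the Hadoop location used in this tutorial; `has_proxyuser` is a hypothetical helper.

```shell
# Sketch: check that a core-site.xml defines both proxyuser properties.
# has_proxyuser is a hypothetical helper; the default path is an assumption.
has_proxyuser() {
    # $1 = path to core-site.xml, $2 = user name
    grep -qF "<name>hadoop.proxyuser.$2.hosts</name>" "$1" &&
    grep -qF "<name>hadoop.proxyuser.$2.groups</name>" "$1"
}

CORE_SITE="${CORE_SITE:-/home/hduser/hadoop/conf/core-site.xml}"
if [ -f "$CORE_SITE" ] && has_proxyuser "$CORE_SITE" hduser; then
    echo "proxyuser settings present"
else
    echo "proxyuser settings missing (or $CORE_SITE not found)"
fi
```

Hadoop reads this file at startup, so the Hadoop daemons need a restart before a change takes effect.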
Step 6: Create the Oozie WAR File
$ ./bin/oozie-setup.sh prepare-war
setting CATALINA_OPTS="$CATALINA_OPTS -Xmx1024m"
New Oozie WAR file with added 'ExtJS library, JARs' at /home/hduser/oozie/oozie-server/webapps/oozie.war
Step 7: Create the Share Library
$ ./bin/oozie-setup.sh sharelib create -fs hdfs://localhost:54310
setting CATALINA_OPTS="$CATALINA_OPTS -Xmx1024m"
Step 8: Create Oozie DB
$ ./bin/ooziedb.sh create -sqlfile oozie.sql -run
setting CATALINA_OPTS="$CATALINA_OPTS -Xmx1024m"
Validate DB Connection
DONE
Check DB schema does not exist
DONE
Check OOZIE_SYS table does not exist
DONE
Create SQL schema
DONE
Create OOZIE_SYS table
DONE
Step 9: Start Oozie as a Background Process
$ ./bin/oozied.sh start
Step 10: Start Oozie in the Foreground (alternative to Step 9)
$ ./bin/oozied.sh run
Step 11: Check the Oozie Status
$ ./bin/oozie admin -oozie http://localhost:11000/oozie -status
System mode: NORMAL
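The server can take a little while to come up after it is started, so the status check may fail on the first try. One way to handle that is to poll the status command until it reports NORMAL, as in this sketch. `wait_for_oozie` and the `OOZIE_STATUS_CMD` override are hypothetical names introduced here so the loop can be tried without a running server. Only the default command comes from this step.

```shell
# Sketch: poll the Oozie status command until it reports NORMAL mode.
# OOZIE_STATUS_CMD is a hypothetical override, not an Oozie setting.
OOZIE_STATUS_CMD="${OOZIE_STATUS_CMD:-./bin/oozie admin -oozie http://localhost:11000/oozie -status}"

wait_for_oozie() {
    # $1 = maximum number of attempts, $2 = seconds to wait between attempts
    attempt=1
    while [ "$attempt" -le "$1" ]; do
        # Unquoted on purpose: the variable holds a command plus its arguments.
        if $OOZIE_STATUS_CMD 2>/dev/null | grep -q 'System mode: NORMAL'; then
            return 0
        fi
        attempt=$((attempt + 1))
        sleep "$2"
    done
    return 1
}
```

Run from the oozie directory, `wait_for_oozie 10 3 && echo "Oozie is up"` would try up to ten times, three seconds apart.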
Step 12: Setup the Oozie Client
$ cd ..
$ cp oozie/oozie-client-3.3.2.tar.gz .
$ tar xvzf oozie-client-3.3.2.tar.gz
$ mv oozie-client-3.3.2 oozie-client
$ cd oozie-client/bin
After installing Oozie, restart your terminal.
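The steps above do not show any change to the shell startup file, but restarting the terminal only helps if the client directory is on the PATH. Assuming the client was unpacked to /home/hduser/oozie-client as in Step 12, lines like the following could be added to ~/.bashrc. This is a sketch: `OOZIE_CLIENT_BIN` is a hypothetical variable name, while `OOZIE_URL` is the environment variable the oozie command line reads when the -oozie option is omitted.

```shell
# Sketch: lines one might add to ~/.bashrc so the oozie client is on PATH.
# OOZIE_CLIENT_BIN is a hypothetical name; the path follows this tutorial.
OOZIE_CLIENT_BIN="${OOZIE_CLIENT_BIN:-/home/hduser/oozie-client/bin}"

# Prepend the directory only if it is not already on PATH.
case ":$PATH:" in
    *":$OOZIE_CLIENT_BIN:"*) ;;
    *) PATH="$OOZIE_CLIENT_BIN:$PATH" ;;
esac
export PATH

# OOZIE_URL lets the oozie CLI omit the -oozie option.
export OOZIE_URL="http://localhost:11000/oozie"
```

With these in place, `oozie admin -status` should work from any directory without the full server URL.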